**SpringerBriefs in Earth System Sciences Gaël Kermarrec · Vibeke Skytt · Tor Dokken**

# **Optimal Surface Fitting of Point Clouds Using Local Refinement** Application to GIS Data

# **SpringerBriefs in Earth System Sciences**

#### **Series Editors**

Gerrit Lohmann, Universität Bremen, Bremen, Germany

Lawrence A. Mysak, Department of Atmospheric and Oceanic Science, McGill University, Montreal, QC, Canada

Justus Notholt, Institute of Environmental Physics, University of Bremen, Bremen, Germany

Jorge Rabassa, Labaratorio de Geomorfología y Cuaternar, CADIC-CONICET, Ushuaia, Tierra del Fuego, Argentina

Vikram Unnithan, Department of Earth and Space Sciences, Jacobs University Bremen, Bremen, Germany

SpringerBriefs in Earth System Sciences present concise summaries of cutting-edge research and practical applications. The series focuses on interdisciplinary research linking the lithosphere, atmosphere, biosphere, cryosphere, and hydrosphere building the system earth. It publishes peer-reviewed monographs under the editorial supervision of an international advisory board with the aim to publish 8 to 12 weeks after acceptance. Featuring compact volumes of 50 to 125 pages (approx. 20,000—70,000 words), the series covers a range of content from professional to academic such as:


Briefs are published as part of Springer's eBook collection, with millions of users worldwide. In addition, Briefs are available for individual print and electronic purchase. Briefs are characterized by fast, global electronic dissemination, standard publishing contracts, easy-to-use manuscript preparation and formatting guidelines, and expedited production schedules.

Both solicited and unsolicited manuscripts are considered for publication in this series.

# Optimal Surface Fitting of Point Clouds Using Local Refinement

Application to GIS Data

Gaël Kermarrec Institute for Meteorology and Climatology Leibniz University Hannover Hanover, Germany

Tor Dokken Mathematics and Cybernetics SINTEF Digital Oslo, Norway

Vibeke Skytt Mathematics and Cybernetics SINTEF Digital Oslo, Norway

ISSN 2191-589X ISSN 2191-5903 (electronic) SpringerBriefs in Earth System Sciences ISBN 978-3-031-16953-3 ISBN 978-3-031-16954-0 (eBook) https://doi.org/10.1007/978-3-031-16954-0

© The Author(s) 2023. This book is an open access publication.

**Open Access** This book is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this book are included in the book's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the book's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors, and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, expressed or implied, with respect to the material contained herein or for any errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland

# **Foreword**

Splines are piecewise polynomial functions and are ubiquitous in the sciences. They play a prominent role in computer-aided geometric design, signal processing, data analysis, visualization, numerical simulation, probability, and many more.

In one variable, splines are usually represented in terms of so-called B-splines. B-splines turn out to be the most useful spline basis functions because they possess several properties that are important from both theoretical and computational point of view. Even though the concept of B-splines was already known before, the modern B-spline theory roots in the seminal works by Isaac Jacob Schoenberg in the midtwentieth century and has many important developments ever since. When moving to multiple variables, tensor-product B-spline representations are the most common choice thanks to their simplicity of construction and the inheritance of all nice properties of the univariate representations. But they have limitations, and therefore, several alternative spline technologies have been proposed more recently for dealing with more general domains, unstructured partitions, and local refinement.

Oslo has a long-standing tradition in spline research, with several fundamental contributions. The so-called Oslo algorithm (1980) is probably one of the most famous outcomes of this research, a general method for inserting degrees of freedom to a given B-spline curve. A recent contribution is the creation of "LR B-splines" (2013), an elegant theory for performing local refinement on B-spline surfaces or volumes. Tor Dokken and the Geometry Group at SINTEF are a driving force behind the development and application of LR B-splines and their promotion within both academia and industry. His endless enthusiasm and encouragement are a great stimulus for all researchers who are working—or will work—on the topic.

This book will introduce you in the world of LR B-splines, with a particular focus on their application to surface fitting. Finding a compact surface description of scattered observations is an important problem. LR B-splines could provide an excellent tool for this. Adaptive refinement allows for local updates of the LR Bspline surface where the distance to the point cloud exceeds a prescribed tolerance, without introducing a huge number of redundant degrees of freedom. The theoretical framework is illustrated with practical examples of bathymetry and landslides GIS datasets. It is a comprehensive source for everyone interested in adaptive surface fitting and splines.

Rome, Italy April 2022

Hendrik Speleers

# **Acknowledgements**

Dr. Gaël Kermarrec is supported by the Deutsche Forschungsgemeinschaft under the project KE2453/2-1. This contribution proposes innovative solutions to perform the surface approximation of noisy and correlated point clouds from terrestrial laser scanner, with the goal to improve spatio-temporal monitoring of object changes based on mathematical surfaces. This publication was financially supported by the Leibniz University Hannover's Open Access Publishing Fund.

The work of Chief Scientist Dr. Tor Dokken (SINTEF) and Senior Scientist Vibeke Skytt (SINTEF) on LR B-splines for the representation, processing, and analysis of big GIS point clouds presented in this SpringerBrief has been financially supported by the EU through the FP7 project IQmulus (FP7-ICT-2011-318787) (2012–2016) and by the Research Council of Norway through the IKTPluss project ANALYST (Contract no. 270922)(2017–2021).

The authors would like to express their most sincere thanks to Daniel Czerwonka-Schröder for having shared the landslide dataset used in Chap. 6. This measurement campaign was supported by the Research Fund of the European Union for Coal and Steel [RFCS project number 800689 (2018)]. Ke Lu is further gratefully thanked for having implemented and performed some simulations on the AIC in Chap. 4.

We also want to warmly thank Henning Sundby from the Norwegian Hydrographic Service (a division of the Norwegian Mapping Authority) for pointing us which datasets to use for testing the spline technology presented in this SpringerBrief. He has also been an excellent dialogue partner with respect to the challenges and possible technological solutions for handing of large ocean floor and sea bottom datasets, as well as the use of spline technologies to address these challenges.

Senior Scientist Dr. Oliver Barrowclought from SINTEF has been a driving force for improving the text of this SpringerBrief, for which he deserves a warm thanks.

# **Highlights of the SpringerBrief**


# **About This Book**

With the development of high-rate sensors based on light detection and ranging technologies, geospatial data representing terrains and seabeds contains millions of points. Performing a surface approximation is an efficient way to reduce, smooth, and structure the recorded data. Prominent applications are the analysis of spatiotemporal deformation, or the drawing of contour lines.

In this SpringerBrief, we dive into the concept of Locally Refined (LR) B-splines to approximate point clouds. We describe both intuitively and mathematically how local adaptive refinement is performed and highlight its advantages over other methods. Various examples using datasets from a sonar, a terrestrial laser scanner, and a hand-scanner illustrate the methodology. A suitable procedure to deal with outliers and voids within the context of surface approximation is proposed. We conclude by highlighting the potential of LR B-splines surfaces and volumes to perform spatio-temporal geomorphological or geodetic deformation analysis, as promising applications.

This SpringerBrief is written for a wide audience: from a practitioner wishing to perform surface approximation of point clouds, to mathematicians interested in understanding the concepts of local refinement and their potential applications.

# **Contents**



#### Contents xv


# **About the Authors**

**Dr. Gaël Kermarrec** received her M.S. (2000) in geodesy from the Ecole Nationale des Sciences Geographiques in Marne la Vallee, France. She made her promotion at the Karlsruhe Institute of Technology (KIT) in Germany in 2016, with a focus on GPS correlations modelling. Her research topics are surface modelling of point clouds from terrestrial laser scanners, as well as correlation analysis. Her main field of interest is physical modelling of correlations due to atmospheric turbulence.

**Vibeke Skytt M.Sc.** graduated from the University of Oslo in 1986 with a Master of Science in numerical analysis and has since then worked as a researcher and senior researcher at SINTEF in Oslo. Her field is geometry with emphasis on splines, with applications in various areas such as CAD, reverse engineering, and modelling for isogeometric analysis. She has, in recent years, worked with representation of geographic data with locally refined splines.

**Dr. Tor Dokken** received his Dr. Philos degree from the University of Oslo in 1997. He is Chief Scientist and Research Manager for the Geometry Group in SINTEF Digital, has been actively participating in the SIAM Activity Group for Geometric Design, and is one of the organizers of the Dagstuhl Seminars on Geometric Modelling: Interoperability and New Challenge. Coordinating a series of EU-funded projects addressing design and simulation, isogeometric analysis, additive manufacturing, and GIS Big Data is and has been one of his key activities. He is, in the period 2020–2024, the coordinator for the H2020 Innovation Action Change2Twin addressing digital twins for manufacturing and part of a parallel Innovation Action PULSATE addressing advanced laser based and additive manufacturing.

# **Acronyms**


# **Chapter 1 Introduction**

**Abstract** With the development of high rate sensors based on LIDAR (light detection and ranging) and sonar technology, geospatial data representing terrain or seabed often contains millions of points. Performing a surface approximation of the point clouds is an elegant way to reduce noisy and unorganized data to a mathematical surface with just a few coefficients to estimate. Traditional spline surfaces are able to compactly represent smooth shapes, but lack the ability to adapt the representation locally to the point clouds. Locally Refined (LR) B-spline surfaces address that challenge as they have the nice property of being locally refinable. Their format can be made compatible with most Geographic Information System (GIS) software, and they facilitate various applications such as the drawing of contour lines or spatiotemporal deformation analysis. This introduction aims to explain the need for surface approximation, and present the state of the art in that domain. We compare the LR B-spline approach with different methods for surface approximation including raster, and triangular irregular networks.

**Keywords** Geospatial data · LR B-Spline surfaces · Approximation · Surfaces in GIS

# **1.1 The Why and How of Surface Approximation**

The advance of contactless laser range scanners, either terrestrial, airborne or underwater as well as sonars, enables to capture 3D data of large areas rapidly and with a high accuracy [Weh99, Eno19]. The applications of such sensors are diverse, going from forest inventory to agricultural monitoring, deformation analysis of bridges and dams, underwater or seafloor shell fragment characterization but also cultural heritage, to name only but a few (see, e.g., [Muk16] or [Wu22] for a review of applications). While working directly with the recorded point clouds may be adequate for visualization, or animation purposes, the manipulation of millions of points is less attractive as soon as shape analysis is needed [Flo02]. Within the context of geospatial data approximation or reverse engineering [Raj08], it is favorable to convert the observations to a mathematical surface. Here the latter are defined by parametric

<sup>©</sup> The Author(s) 2023

G. Kermarrec et al., *Optimal Surface Fitting of Point Clouds Using Local Refinement*, SpringerBriefs in Earth System Sciences, https://doi.org/10.1007/978-3-031-16954-0\_1

equations and approximate data by minimizing the distance between the point clouds and the approximated surface.

Throughout the book, we define an "approximation" as a counterpart to mathematical "interpolation" for which the resulting surface passes through all the data points (see, e.g., [Fol86]). In the context of Geographic Information Systems (GIS), the term "spatial interpolation" is given a more broad definition. It is the process of using points with known values to estimate values at other points. Spatial interpolation can further be divided into exact or inexact interpolation. We will reserve the term "interpolation" to mean an exact fitting of a set of data points while inexact fitting will be denoted "approximation". Interpolation is unfavorable when a huge number of points is available. The approximation of noisy, unstructured and scattered point clouds **transforms data to information**: The resulting surfaces are less redundant and complex than when interpolation is performed.

Rigorous statistical testing of the deformation of objects such as bridges, dams or tunnels with underlying safety applications, can be best performed with mathematical surfaces. Within a geodetic context, they further make huge point clouds easy to handle and manipulate. Unfortunately, many practitioners are hesitating to use parametric surfaces to approximate their data, expressing concerns such as "*is it accurate enough?*", "*is it time consuming*?", "*I don't understand formulas*". Thus, the use of mathematical approximations of point clouds can only grow if easy-to-use and easy-to-understand approximation methods are proposed. The following criteria have to be considered:


The approach we follow in this SpringerBrief is to use Locally Refined B-spline surfaces, abbreviated as LR B-spline surfaces, for approximating geospatial data [Dok13, Sky15].

## **1.2 Surface Representation of Geospatial Data**

There are many other data representations in addition to LR B-spline surfaces that can be used to approximate point clouds. In the following sections, we present two prominent examples: Raster representation and triangulated irregular network (TIN). We provide a short comparison of these methods with LR B-spline surface fitting. Other representations such as radial basis functions used, e.g., for gravity field modelization [Ten08] and trend surfaces are not directly applicable for approximation of large datasets and therefore omitted. We note that in GIS the term "spline" most often refers to splines in tension or regularized splines, which differ from tensor product (TP) B-splines surfaces. In Computer Aided Design (CAD) a rational version of TP B-spline surfaces is used, the so-called Non Uniform Rational B-splines (NURBS) surfaces.

## *1.2.1 Raster Representation*

The raster representation is the most frequently used data format in GIS, [Bis18]. The digital elevation model (DEM) is often represented as a raster. The raster is an approximate representation as the scattered input data are not exactly fitted. A given cell contains a single value, often the elevation, so that the level of detail is restricted by the raster cell resolution. If this resolution is low compared to the variation in the data, meaning that there is a large height difference between points in a cell, then the result may be inaccurate. On the contrary, if the resolution is too high, the data volume grows more than necessary. A trade-off between accuracy and data volume must be made, and this is particularly mandatory when there are large differences in the local variation of the data in different areas. However, the raster remains a compact, highly structured and efficient representation. Proposals have been made based on a compact data structure to access a given datum or portion of the data more rapidly, [Sil21].

Figure 1.1a shows a cloud of 999,751 points consisting of classified ground points and points from the sea surface. A raster representation with 1 m resolution is shown in Fig. 1.1b. The raster is computed using inverse distance weighting (IDW, [She68]). It has size 800 × 600 and the average number of points used for estimating the

**Fig. 1.1** Raster representation of terrain. **a** Point cloud from Fjøløy in Norway, **b** raster representation visualized with Python Rasterio

**Fig. 1.2** A small triangulated point set. The data points are red and the triangulation edges are shown in blue

raster points is 5676.79. The accuracy of this raster representation is addressed in Sect. 1.2.5.1.

Several approaches are available to estimate values between the existing samples. We mention the selection in the cell center or the bivariate evaluation. Alternatively, the estimated value can be computed from a bivariate surface interpolating the four surrounding sample values. Kriging or IDW can be also used in this context, see [Oli90] or [She68]. The reader is referred to, e.g., [Wis11], [Mit05] or [Fis06] for more details and specific comparisons between methods.

## *1.2.2 Triangulated Irregular Network (TIN)*

A TIN is a continuous surface representation frequently used in GIS. This is a flexible format for geospatial data that allows adaptation to local variations, and is highly accurate. Similarly to raster representations and LR B-splines, an approximation is required to restrict the data size. The nodes of a TIN are distributed variably to create an accurate representation of the terrain. TINs can, thus, have a higher resolution in areas where a surface is highly variable or where more detail is desired and a lower resolution in areas that are less variable. They are typically used for high-precision modelling of smaller areas. In [Nel94] a triangulated surface is used to represent drainage-basins while hydrological similarity is used in the TIN creation in [Viv04].

TINs have a more complex data structure than raster surfaces and tend to be more expensive to build and process. Points in-between the corner points in a triangle are calculated by linear interpolation. This can give a jagged appearance of the surface. The problem is especially visible at sharp or nearly sharp edges, but can be remedied by methods like constrained Delaunay triangulation. A comprehensive discussion on various aspects with triangulation in terrain modelling is presented in [Li09].

The triangulation shown in Fig. 1.2 is interpolating a sparse set of terrestrial data points and created with unconstrained Delaunay triangulation. The data is fetched from LIDAR measurements of the island Fjøløy in Norway and is subsampled to improve visibility.

## *1.2.3 B-Spline Curves and Tensor Product Surfaces*

B-spline curves are piecewise polynomial curves with continuity between adjacent polynomial pieces embedded in the curve formulation. The joints between the polynomial pieces are defined by the so-called knot vector. The polynomial degree can be chosen but is often selected to be three. A B-spline curve is described as a linear combination of a set of coefficients and corresponding B-splines basis functions, see [Pie91]. The maximum possible continuity between the polynomial pieces is equal to the polynomial degree minus one.

The B-spline basis functions are themselves piecewise polynomials and have several attractive properties:


The properties of the B-splines imply that the representation is numerically stable and that a B-spline curve is bounded by its coefficients. The curve will always lie inside the polygon described by its coefficients. A B-spline curve is locally refinable, i.e., new knots can be inserted into the curve description as required.

A bivariate tensor product (TP) B-spline surface is constructed by taking the tensor product of the basis functions defined over knot vectors in two parameter directions, and defining appropriate coefficients. This construction carries over the attractive properties of non-negativity, limited support of the B-spline basis functions, partition of unity and linear independence. Unfortunately, the TP B-spline surface formulation does not allow local refinement. If a new knot is entered in one of the parameter directions, it will cover the entire parameter domain in the other direction.

## *1.2.4 Locally Refined B-Spline Surfaces*

The LR B-spline surface [Dok13] is one approach to solve the problem of lack of local refinability of TP B-spline surfaces. Other approaches include hierarchical B-splines [For88, Bra18], its variation called Truncated Hierarchical B-splines [Gia12], and T-splines [Sed03]. For LR B-splines, the starting point is always a TP B-spline surface with a corresponding mesh of lines defined by the knots in the

**Fig. 1.3** TP and LR meshes. **a** The initial TP mesh with the support of one biquadratic B-spline highlighted, **b** LR mesh after insertion of several meshlines with the support of one biquadratic B-spline highlighted

two parameter directions (a TP mesh). The TP mesh is converted into an LR mesh and new meshlines are inserted into the mesh to refine the surface. The surface coefficients are updated accordingly. The new meshlines do not need to cover the entire region, but must traverse the support of at least one B-spline. Each new meshline leads to one or more B-splines being split to give rise to more B-splines and consequently more approximation freedom. The TP mesh is a special case of an LR mesh. Figure 1.3 illustrates how a TP mesh can be extended to an LR mesh through multiple mesh refinements.

For LR B-spline surfaces the construction of the TP B-splines spanning the spline space is similar to the construction of the TP B-splines spanning the spline space of TP B-spline surfaces. In both cases they are TP B-splines that are defined from a subset of the knot vectors in the two parameter directions. The TP B-splines are regular except for possible variation in the width and height of mesh cells due to varying intervals between adjacent knots. The LR B-splines can differ greatly both in terms of the size of the support and the number of LR B-splines overlapping a particular parameter point in the surface domain. The LR B-splines are non-negative, have limited support, and possess the partition of unity property. The collection of LR B-splines are not linearly independent by default, but possible occurrences of linear dependency can be detected and removed.

Figure 1.4a shows a biquadratic LR B-spline surface approximating the point cloud in Fig. 1.1 (tolerance 0.5 m and 5 iteration steps in an adaptive surface fitting algorithm, see Chap. 3). Here the advantages of local refinement are highlighted:


The accuracy of this surface representation will be addressed in Sect. 1.2.5.1.

**Fig. 1.4** LR B-spline representation of the point cloud shown in Fig. 1.1. **a** The approximating surface, **b** the associated LR mesh

## *1.2.5 Comparison Between Approximation Strategies*

Table 1.1 provides an overview over the surface representations described previously. TINs and LR B-spline surfaces are created by adaptive algorithms, so that the degrees of freedom in the surface can be determined according to the need, and the accuracy of the fit is directly available. The raster representation and the TP B-spline surfaces—a generalization of the raster representation—are less flexible. The TP B-spline surface description is more flexible than the raster due to the choice of polynomial bidegree and/or variable knot vectors. However, the raster representation is slightly more compact than the TP B-spline surface for the representation of a piecewise constant or piecewise bilinear function. The TP B-spline and LR B-spline methods provide smoother surfaces than the other methods. The lack of local refinement, however, implies that the TP B-spline surface size grows much faster compared with LR Bspline surfaces.

#### **1.2.5.1 Comparison Raster/LR B-Spline Surfaces**

Here we wish to point out the advantages of an LR B-spline surface approximation with respect to the raster approximation. To that end, we come back to the point cloud approximated in Sects. 1.2.1 and 1.2.4. Figure 1.5 shows the point cloud coloured according to the distance to (a) the approximated raster surface, and (b) the LR Bspline surface. The raster surfaces were created with IDW and evaluated by linear interpolation. This is not necessarily the optimal approximation method, but it is a method frequently used in GIS. Visually, the distance between the point cloud and the surface is largest for the raster surface: The most distant points are concentrated in areas with much shape variation in the terrain whereas the sea surface is accurately represented in both cases. These results are summarized in Table 1.2. The file sizes


**Table 1.1** Summary of surface formats for representing terrain and seabed

of the raster surfaces (GeoTIFF) are generally larger than for LR B-spline surface (ASCII), so are the distances between the points and the surface. This data set favours the LR B-spline surface with one part representing the horizontal sea surface and one part a terrain with considerable height variations. The property to allow for local variations is the main advantage of LR B-spline surface approximation, not to mention the strong reduced data size of the final surface.

To summarize, LR B-spline surfaces:


**Fig. 1.5** The point cloud from Fig. 1.1, **a** coloured according to the distance to the raster surface with 1 m resolution shown in Fig. 1.1, **b** the LR B-spline surface in Fig. 1.4. The size of the most distant points is increased compared to points closer to the surface for improved visibility. White points lie closer to the surface than 0.5 m, green points lie below the surface and red points above


**Table 1.2** Comparison between raster representation and LR B-spline surface

Raster surfaces are computed with 0.5 and 1 m resolution, LR B-spline surfaces with 5 and 12 iteration steps using a tolerance of 0.5 m. Distances are given in m and point distribution in percentage of the total number of points (1,643,865). MAE = average absolute value of distance, and |*d*| is the absolute value of the distance between a point and the surface

We will highlight these properties in Chaps. 5 and 6.

#### **1.2.5.2 Summary**

Many applications can be derived from approximation of point clouds with mathematical surfaces, such as the drawing of contour lines, or rigorous deformation analysis based on statistical tests. The result of the approximation and the choice of the method depend on the characteristics of the data, the purpose of the surface generation and user defined criteria, such as the computational time. For GIS applications, the LR B-spline surface with adaptive local refinement is favourable and the principle intuitive and understandable ([Sky22] for some examples, [Ker21] for geodetic applications). We point out that neither LR B-spline surfaces, nor raster nor TIN is the most appropriate representation: This latter does not exist. The definition of goodness of fit depends on the applications and the data at hand.

## **1.3 Reminder of the Present SpringerBrief**

In Chap. 2, we will present in details the concepts of LR B-splines and review alternative local approximation methods, such as hierarchical B-splines and T-splines. In the LR B-spline surface approximation with adaptive refinement, parameters are inserted locally, when needed. For geodetic objects such as a bridge or for landslides, this approach is favourable as more details can be needed in domains where, e.g., strong deformations are likely to happen or the object has edges.

The procedure of adaptive refinement will be developed in Chap. 3. The algorithms are optimized for a wide use within a GIS or geodetic context. For approximating geospatial points, biquadratic surfaces have proven to be a good choice [Sky22]. They provide a good balance between smoothness and flexibility. Once an LR B-spline surface representation of some scattered data is obtained, terrain information can be derived such as, e.g., contour curves, slope and aspect ratio. Deformation analysis and statistical tests can be performed at the level of the surface approximation [Sky22, Ker20, Ker21]. We will develop such applications in Chaps. 5 and 6 and present how specific challenges such as data gaps and outliers can be handled efficiently. The concept of LR B-splines volumes will be described. The optimal determination of approximation parameters such as the tolerance with respect to the noise level of the point cloud, is part of Chap. 4, which addresses how to choose less empirically some refinement parameters or strategies.

## **References**


[Wu22] Wu, C., Yuan, Y., Tang, Y., & Tian, B. (2022). Application of terrestrial laser scanning (TLS) in the architecture, engineering and construction (AEC) industry. *Sensors*. https://doi. org/10.3390/s22010265

**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Chapter 2 Locally Refined B-Splines**

**Abstract** The univariate minimal support B-spline basis (UMB) has been used in Computer Aided Design (CAD) since the 1970s. Freeform curves use UMB, while sculptured surfaces are represented using a tensor product of two UMBs. The coefficients of a B-spline curve and surface are respectively represented in a vector and a rectangular grid. In CAD-intersection algorithms for UMB represented objects, a divide-and-conquer strategy is often used. Refinement by knot insertion is used to split the objects intersected into objects of the same type with a smaller geometric extent. In many cases the intersection of the resulting sub-objects has simpler topology than the original problem. The sub-objects created are represented using their parents' UMB format and deleted when the sub-problem is solved. Consequently, no global representations of the locally refined bases are needed. This is contrary to when locally refined splines are used for approximation of large point sets. As soon as a B-spline is locally refined, the regular structure of UMB objects in CAD is no longer valid. In this chapter we discuss how Locally Refined B-splines (LR B-splines) address this challenge and present the properties of LR B-splines.

**Keywords** Locally Refined B-splines · Minimal support basis · Refinement

# **2.1 Introduction**

An introduction to B-splines, tensor product (TP) B-splines and the univariate minimal support B-spline basis (UMB) is provided in Sect. 2.2. LR B-splines are a generalization of local refinement of curves represented in UMB to the multivariate case, see Sect. 2.3 for details on refinement of B-spline curves. An overview of B-spline based methods for local refinement is addressed in Sect. 2.4, with LR B-splines described in more details in Sect. 2.5. To give a first motivation for the approach of LR B-splines, we will now illustrate the LR B-spline representation for the bivariate case, and introduce the notation used for describing LR B-splines.

We start from the traditional representation of a CAD-type TP B-spline surface.

$$F(\boldsymbol{u}, \boldsymbol{v}) = \sum\_{i=0}^{N\_1 - 1} \sum\_{j=0}^{N\_2 - 1} c\_{i,j} B\_{i, p\_1}(\boldsymbol{u}) B\_{j, p\_2}(\boldsymbol{v}), \quad (\boldsymbol{u}, \boldsymbol{v}) \in [\boldsymbol{u}\_{p\_1}, \boldsymbol{u}\_{N\_1}] \times [\boldsymbol{v}\_{p\_2}, \boldsymbol{v}\_{N\_2}].$$

Here *ci*,*<sup>j</sup>* ∈ *R<sup>d</sup>* , *i* = 0,..., *N*<sup>1</sup> − 1, *j* = 0,..., *N*<sup>2</sup> − 1 are the surface coefficients and *d* is the dimension of the geometry space. Note that for parametric curves and surfaces, when the geometry space has dimension larger than 1 (*d* > 1), i.e., when we have a parametric curve or surface, the coefficients are frequently denoted control points. In this chapter we will use coefficients except for cases where the use of control points is necessary to ensure clarity. We also have two UMBs, one in each parameter direction: *Bi*,*p*<sup>1</sup> (*u*),*i* = 0,..., *N*<sup>1</sup> is an UMB defined over the knot vector {*u*0,..., *uN*1+*d*<sup>1</sup> }, and *Bj*,*p*<sup>2</sup> (v), *j* = 0,..., *N*<sup>2</sup> is an UMB defined over the knot vector {v0,...,v*<sup>N</sup>*2+*d*<sup>2</sup> }. Note that the TP B-splines *Bi*,*j*,*p*1,*p*<sup>2</sup> (*u*, v) = *Bi*,*p*<sup>1</sup> (*u*)*Bj*,*p*<sup>2</sup> (v) are implicitly defined in the equation above. Making the TP B-splines explicit, the equation above can be reformulated to

$$F(\boldsymbol{\mu}, \boldsymbol{\upsilon}) = \sum\_{i=0}^{N\_1 - 1} \sum\_{j=0}^{N\_2 - 1} c\_{i, j} \mathcal{B}\_{i, j, p\_1, p\_2}(\boldsymbol{\mu}, \boldsymbol{\upsilon}), \quad (\boldsymbol{\mu}, \boldsymbol{\upsilon}) \in [\boldsymbol{\mu}\_{p\_1}, \boldsymbol{\mu}\_{N\_1}] \times [\boldsymbol{\upsilon}\_{p\_2}, \boldsymbol{\upsilon}\_{N\_2}].$$

In the equation above, the grid structure of coefficients and TP B-splines is still present. To prepare for LR B-splines we have to reformulate the above to a collection of TP B-splines with corresponding coefficients. Now define

$$\mathcal{B}\_0 = \{ B \mid B = B\_{i,j,p\_1,p\_2}(\mu, \upsilon), i = 0, \dots, N\_1 - 1, j = 0, \dots, N\_2 - 1 \}.$$

This allows us to express *F*(*u*, v) as

$$F(\boldsymbol{\mu}, \boldsymbol{\upsilon}) = \sum\_{\boldsymbol{B} \in \mathcal{B}\_0} c\_{\boldsymbol{B}} \mathbf{s}\_{\boldsymbol{B}} B(\boldsymbol{\mu}, \boldsymbol{\upsilon}).$$

The scaling factors*sB* are introduced to allow scaling of refined B-splines to provide a scaled partition of unity of a collection of LR B-splines. For a TB-surface all *sB* ≡ 1. Note that we index the coefficients *cB* and the scaling factors *sB* by the TP B-spline *B* that it corresponds to.

## **2.2 B-Splines and Tensor Product B-Splines**

Given a non-decreasing sequence **u** = (*u*0, *u*1,..., *u <sup>p</sup>*+1) we define a B-spline *B*[**u**] : <sup>R</sup> <sup>→</sup> <sup>R</sup> of degree *<sup>p</sup>* <sup>≥</sup> 0 recursively as follows [Sch81]

$$B[\mathbf{u}](u) := \frac{u - u\_0}{u\_p - u\_0} B[u\_0, \dots, u\_p](u) + \frac{u\_{p+1} - u}{u\_{p+1} - u\_1} B[u\_1, \dots, u\_{p+1}](u), \quad (2.1)$$

starting with

$$B[\mu\_i, \mu\_{i+1}](\mu) := \begin{cases} 1; & \text{if } \mu\_i \le \mu < \mu\_{i+1}; \\ 0; & \text{otherwise}, \end{cases} \quad i = 0, \ldots, p.$$

We define *B*[**u**] ≡ 0 if *u <sup>p</sup>*+<sup>1</sup> = *u*<sup>0</sup> and terms with zero denominator are defined to be zero.

A univariate spline space can be defined by a polynomial degree *p* and a knot vector **u** = {*u*0, *u*2,..., *uN*+*p*}, where the knots satisfy: *ui*+<sup>1</sup> ≥ *ui* , *i* = 0,..., *N* + *p* − 1, and *ui*+*p*+<sup>1</sup> > *ui* , *i* = 0,..., *N* − 1, i.e, a knot value can be repeated *p* + 1 times. The number of times a knot value is repeated is called the multiplicity *m* of the knot value. The continuity of the spline function across a knot value of multiplicity *m* is *Cp*−*<sup>m</sup>*.

A basis for the univariate spline space above can be defined in many ways. The approach most often used is the univariate minimal support B-spline basis. In this, the B-splines are defined by selecting *p* + 2 consecutive knots from **u**, starting from the first knot. So *Bi*,*<sup>p</sup>*(*u*) := *B*[*ui*,..., *ui*+*p*+1](*u*)is defined by the knots *ui*,..., *ui*+*p*+1, *i* = 0,..., *N* − 1. The minimal support B-spline basis has useful properties that ensure numeric stability such as local support, non-negativity and partition of unity (the basis functions sum up to one in all parameter values in the interval [*u <sup>p</sup>*, *uN* ]).

Given two non-decreasing knot sequences **u** = {*u*0, *u*1,..., *uN*1+*p*<sup>1</sup> } and **v** = {v0, v1,...,v*<sup>N</sup>*2+*p*<sup>2</sup> } where *p*<sup>1</sup> ≥ 0 and *p*<sup>2</sup> ≥ 0. We define a bivariate TP B-spline *Bi*,*j*,*p*1,*p*<sup>2</sup> : <sup>R</sup><sup>2</sup> <sup>→</sup> <sup>R</sup> from the two univariate B-splines *Bi*,*p*<sup>1</sup> (*u*) and *Bj*,*p*<sup>2</sup> (v) by

$$B\_{i,j,p\_1,p\_2}(\mu,\upsilon) := B\_{i,p\_1}(\mu) B\_{j,p\_2}(\upsilon).$$

The support of *B* is given by the Cartesian product

$$\text{supp}(B\_{i,j,p\_1,p\_2}) := [\mu\_i, \mu\_{i+p\_1+1}] \times [\upsilon\_j, \upsilon\_{j+p\_2+1}].$$

A bivariate TP spline space is made by the tensor product of two univariate spline spaces. Assuming that both univariate spline spaces have a minimal support B-spline basis, the minimal support basis for the TP B-spline space is constructed by making all tensor product combinations of the B-splines of the two bases. The minimal support basis for this spline space contains the TP B-splines *Bi*,*j*,*p*1,*p*<sup>2</sup> (*u*, v), *i* = 0,..., *N*<sup>1</sup> − 1, *j* = 0,..., *N*<sup>2</sup> − 1. As in the univariate case, the basis has useful properties such as non-negativity and partition of unity.

## **2.3 Refinement of B-Spline Curves**

Spline curves are frequently represented using a univariate minimal support B-spline basis.

$$f(u) = \sum\_{i=0}^{N-1} c\_i B\_{i,p}(u), u \in [u\_p, u\_N]. \tag{2.2}$$

Here *ci* ∈ *R<sup>d</sup>* , *i* = 0,..., *N* − 1 are the curve coefficients and *d* is the dimension of the geometry space. The curve lies in the convex hull of its coefficients.

A B-spline curve can be locally refined. Figure 2.1a shows a quadratic curve with knot vector {0, 0, 0, 1, 2, 3, 3, 3}. The curve coefficients and the control polygon corresponding to the curve are included in Fig. 2.1, and the associated B-splines are shown below. In Fig. 2.1b, a new knot with value 2 is added, thus increasing the knot multiplicity in an already existing knot. The curve is not altered, but the control polygon is enhanced with a new coefficient. The coefficient is marked with a circle in Fig. 2.1b. The double knot at 2 allows creating a curve with *C*<sup>0</sup> continuity. When moving the coefficient marked with a square, we obtain a sharp corner as in Fig. 2.1c. Note that only the last part of the curve is modified.

**Fig. 2.1** Knot insertion into a quadratic B-spline curve. **a** Initial curve with basis functions, **b** curve and basis functions after knot insertion, **c** curve and basis functions after re-positioning of one coefficient

## **2.4 B-Spline Based Locally Refined Surface Methods**

As described in Sect. 2.3, adding an extra local degree of freedom to a univariate B-spline basis just adds a knot, and an additional coefficient. However, adding an extra degree of freedom in one of the univariate minimal support B-spline basis of a TP B-spline represented surface adds an extra row or column of coefficients in the coefficient grid, with a resulting large increase in the bulk of the surface representation. Consequently, for application areas where it is desirable to add extra degrees of freedom/representation power locally just where needed, TP B-spline representations are not a good solution.

In some applications such as Isogeometric Analysis (IgA) [Hug05], and the representation of terrain and seabed, the lack of local refinement is a severe restriction. A TP B-spline surface covers a rectangular domain and the need for approximation power will not in the general case be uniformly distributed throughout the domain. In IgA the traditional shape functions local to each element in Finite Element Analysis are replaced by B-splines that cross element boundaries and connect elements with higher order continuity.

There are three main B-spline based approaches for extending spline surfaces to support local refinement [Dok19].


a sequence of nested spline spaces. However, T-spline subtypes such as semistandard T-splines and Analysis Suitable T-splines do so by imposing restrictions on how to refine, see Scott et al. [Sco11].

• **Locally Refined (LR) B-splines** [Dok13, Joh13] the refinement approach of this book, starts (as T-splines, HB and THB) from a TP B-spline surface. The refinement is performed successively by inserting axis parallel meshlines in the mesh of knotlines (from here on denoted the mesh). Each meshline inserted has to split the support of at at least one TP B-spline. The constant knot value of a meshline inserted is used for performing univariate knot insertion. The refinement is performed in the parameter direction orthogonal to the meshline in all TP B-splines that have a support split by the meshline. This approach ensures that the spline spaces produced are nested and that the polynomial space is spanned over all polynomial elements. In Sect. 2.5, we provide further details on additional refinements triggered and how the resulting TP B-splines can be scaled to achieve partition of unity.

Note that a main distinction between T-spline algorithms on the one side, and LR B-spline, HB and THB algorithms on the other side is that T-spline algorithms work in the mesh of control points and find the collection of B-splines traversing the mesh of control points, while LR B-spline, HB and THB algorithms directly refine the spline space and thus automatically ensure nested spline spaces.

## **2.5 LR B-Spline Refinement Method**

The process of Locally Refined B-splines is described in detail in [Dok13]. Please consult the paper for formal proofs. Below a summary of the most important steps is presented.

The refinement always starts from a TP B-spline space *B*0. The refinement proceeds with a sequence of meshline insertions producing a series of collections of TP B-splines *B*0, *B*1,..., *B<sup>k</sup>* , *B<sup>k</sup>*+<sup>1</sup> spanning nested spline spaces each providing a refined surface

$$F(u,v) = \sum\_{B \in \mathcal{B}\_l} c\_B s\_B B(u,v), \quad i = 0, 1, \ldots, k, k+1. \tag{2.3}$$

Above we end with *k* + 1, as we will now detail how to create *B<sup>k</sup>*+<sup>1</sup> from *B<sup>k</sup>* .

Note that we require that all TP B-splines in these collections have minimal support. By this we mean that all meshlines that cross the support of a TP B-spline also have to be a line in the mesh representing the knots of the TP B-splines counting multiplicity. We denote these meshlines knotlines of the TP B-spline. Thus, to ensure that all TP B-splines have this property, the process of going from *B<sup>k</sup>* to *B<sup>k</sup>*+<sup>1</sup> frequently includes a number of intermediate steps and a corresponding sequence of intermediate LR B-spline collections.

Assume that Eq. 2.3 represents *F*(*u*, v) using the collection *B<sup>k</sup>* and we now want to represent *F*(*u*, v) using a refined collection of minimal support TP B-splines *B<sup>k</sup>*+1. The refinement process goes as follows:

	- If γ is parallel to second parameter direction then it has a constant parameter value *a* in the first parameter direction. We insert *a* in the univariate B-spline *B*(*u*) using Eq. 2.5 below and express *B*(*u*) as *B*(*u*) = α1*B*1(*u*) + α2*B*2(*u*). Then we make two new TP B-splines *B*1(*u*, v) = *B*1(*u*)*B*(v) and *B*2(*u*, v) = *B*2(*u*)*B*(v).
	- If γ is parallel to first parameter direction then it has a constant parameter value *a* in the second parameter direction. We insert *a* in the univariate B-spline *B*(v) using Eq. 2.5 below and express *B*(v) as *B*(v) = α1*B*1(v) + α2*B*2(v). Then we make the two new TP B-splines *B*1(*u*, v) = *B*(*u*)*B*1(v) and *B*2(*u*, v) = *B*(*u*)*B*2(v).

*B* can be decomposed as follows, *B*(*u*, v) = α1*B*1(*u*, v) + α2*B*2(*u*, v). We can now express *F*(*u*, v) by replacing *B*(*u*, v) with the two new TP B-splines. *F*(*u*, v) = *F*(*u*, v) − *cBsB B*(*u*, v) + *cBsB*(α1*B*1(*u*, v) + α2*B*2(*u*, v)). We update *B<sup>k</sup>* by removing *B* and adding the TP B-splines *B*1(*u*, v) and *B*2(*u*, v). In addition we have to create/update both coefficients and scaling factors belonging to these two TP Bsplines. We must have in mind that *B*1(*u*, v) and *B*2(*u*, v) often will be duplicates of B-splines already in *B<sup>j</sup>* . Now let *Br*,*r* = 1, 2


Note that *sB*, α*<sup>r</sup>* and *sBd* are all positive numbers, thus *sBr* will be positive.

When all *B* ∈ *B<sup>k</sup>* have minimal support, we set *B<sup>k</sup>*+<sup>1</sup> = *B<sup>k</sup>* .

To simplify notation as in Chap. 3, we now define the scaled TP B-splines *NB*(*u*, v) = *sB B*(*u*, v) to provide a basis that is partition of unity for the spline space spanned by *B<sup>j</sup>* . If *F*(*x*, *y*) ≡ 1 then all coefficients of the TP B-spline surface we start from, are 1. In this case the coefficients *cBr* calculated above all remain 1, duplicates or not. Consequently,

$$\sum\_{B \in \mathcal{B}\_k} N\_B(\mu, \upsilon) = \sum\_{B \in \mathcal{B}\_k} s\_B B(\mu, \upsilon) = \sum\_{B \in \mathcal{B}\_0} B(\mu, \upsilon) = 1. \tag{2.4}$$

The process of inserting a knot *a* ∈ (*u*0, *u <sup>p</sup>*+<sup>1</sup>) into the local knot vector [**u**] = {*u*0,..., *u <sup>p</sup>*+<sup>1</sup>} belonging to a univariate B-spline *B*[**u**], of degree *p* was first described by Boehm [Boe80]. We organize the resulting sequence of knots as a

**Fig. 2.2** Parameter domain of an LR B-spline surface with indication on B-spline support. The mesh is shown as black lines. The support of two overlapping B-splines are shown in red and in blue

non-decreasing knot sequence { ˆ*u*0,..., *u*ˆ *<sup>p</sup>*+2}. From this we make two new Bsplines *B*1[**u**1] and *B*2[**u**2] with corresponding knot vectors [**u**1] = {ˆ*u*0,..., *u*ˆ *<sup>p</sup>*+1} and [**u**2] = {ˆ*u*1,..., *u*ˆ *<sup>p</sup>*+2}. Then

$$B[\mathbf{u}] = \alpha\_1 B\_1[\mathbf{u}\_1] + \alpha\_2 B\_2[\mathbf{u}\_2],\tag{2.5}$$

where

$$\begin{aligned} \alpha\_1 &:= \begin{cases} \frac{a - u\_0}{u\_p - u\_0}, & \mu\_0 < a < u\_p, \\ 1, & \mu\_p \le a < u\_{p+1}, \end{cases} \\ \alpha\_2 &:= \begin{cases} 1, & \mu\_0 < a \le u\_1, \\ \frac{u\_{p+1} - a}{u\_{p+1} - u\_1}, & \mu\_1 < a < u\_{p+1}. \end{cases} \end{aligned} \tag{2.6}$$

The incremental refinement by knot insertion used by LR B-splines ensures that the spline spaces generated are nested. Figure 2.2 shows a parameter domain and the segmentation into boxes corresponding to a biquadratic LR B-spline surface. In addition the support of two TP B-splines is shown. We see that a meshline stops inside the blue TP B-spline support. This meshline is excluded from the local knot vectors defining the TP B-spline covering this support.

Linear independence of the resulting collection *B<sup>k</sup>*+<sup>1</sup> of LR B-splines is not guaranteed, but occurrences are rare for the constructions used in this book. The approximation procedure outlined in Chap. 3 does not depend on linear independence. LR B-spline linear dependency configurations can, however, be resolved, if needed by an application, by insertion of extra meshlines. Linear dependence of LR B-splines is addressed in detail in [Pat20].

## **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Chapter 3 Adaptive Surface Fitting with Local Refinement: LR B-Spline Surfaces**

**Abstract** A locally refined (LR) B-spline surface is a piecewise polynomial surface for which the distribution of the surface coefficients can be locally adapted. Such a mathematical representation is interesting for fitting scattered and noisy data, as the local behaviour of a real point cloud may require more degrees of freedom only locally. The number of redundant surface coefficients is minimized, which avoids the fitting of the point cloud's noise. The surface approximation is performed iteratively either by solving a least squares system or by a local approximation method. This procedure allows for mesh refinement in domains where the distance between a current surface and the point cloud exceeds a prescribed tolerance. In this way, parts of the LR B-spline surface obtained at previous steps may be kept unchanged. This chapter aims at explaining the adaptive fitting using local refinement with LR B-splines. We present two examples with simulated point clouds to illustrate the methodology.

**Keywords** LR B-splines surface · Adaptive refinement · Surface fitting · Local refinement · Least-squares · Multilevel B-spline Approximation (MBA)

# **3.1 Adaptive Local Refinement**

In this chapter, we will let the *x*- and *y*-coordinates of the points serve as the parameter values while the surface (or function) approximates the *z*-component of the data. We note that the algorithm handles parameterized points as well, for which each point is given with a parameter pair and 3D coordinates, see [Flo05] for a review of the different parameterization methods. In the following we will focus on elevation and therefore denote the surface parameters *x* and *y*.

# *3.1.1 General Principle of Adaptive Spline Refinement*

We distinguish between two type of methods for fitting point clouds using local refinement:


Adaptive local refinement procedures are favorable in terms of the number of coefficients that defines the final surface. The proposed adaptive local refinement will be performed as follows:


We address alternative refinement strategies in Sect. 3.2, while the technical details of the refinement procedure is described in details in Chap. 2 and briefly in Sect. 3.1.2. The principle of adaptive refinement is summarized in Algorithm 1.

3. Last step: A geospatial point cloud may represent a very rough terrain and contain noise and possible outliers.We usually stop the iteration before all points are closer to the surface than the tolerance to avoid fitting the noise. The refinement process can be stopped by the number of iterations or by statistical criteria as described in Chap. 4.

**Algorithm 1:** Principle of adaptive surface approximation with LR B-spline surfaces.

**Data**: Point cloud, maximum number of iterations, tolerance **Result**: Approximating surface, information on approximation accuracy Generate initial surface; Compute accuracy; **while** *there exist points with larger distance than the given tolerance and the maximum number of iterations is not reached* **do** Refine the surface in areas where the tolerance is not reached; Perform approximation in the current spline space; Compute accuracy;

## *3.1.2 Refining the LR B-Spline Surface*

The initial surface is the starting point for the adaptive surface approximation. It is a TP B-spline surface represented as an LR B-spline surface. The parameter domain of such a surface is defined by a regular mesh of lines (the initial tensor mesh represented as a LR mesh). The meshlines split the domain into rectangles (boxes), each corresponding to a polynomial piece of the surface. Axis parallel meshline segments are successively inserted into the surface where the accuracy requirements are not met and have to satisfy:


The resulting mesh defines a collection of none overlapping boxes. A box can touch but don't overlap adjacent boxes. The union of the boxes corresponds to the union of the boxes of the TP mesh. The B-splines spanning the spline spaces are created in parallel to the refinement of the LR mesh by splitting existing B-splines that no longer have minimal support. This way, the number of internal meshlines traversing the entire support of the B-spline does not exceed the polynomial degree. Note that meshline multiplicity has to be included in the count. Figure 3.1a shows an initial TP mesh corresponding to a biquadratic surface prepared for local refinement. The supports of two overlapping B-splines are shown with brown vertical lines and green diagonal lines. A new meshline (black) is entered. Since it covers the support of the brown B-spline, it is a legal refinement. In Fig. 3.1b, we see that the B-spline corresponding to the brown support is split. The new supports are depicted with brown vertical and brown horizontal lines, respectively. The line does not traverse the green support, and this B-spline remain unchanged. The new line has been included in the mesh, but not in the definition of the green B-spline. The line also leads to the

**Fig. 3.1** Initial TP mesh (blue) with a meshline to insert (black). **a** The supports of two B-splines are shown prior to insertion. **b** After insertion, the brown B-spline is split while the green B-spline is unchanged. The black candidate line has become a part of the LR mesh and turned into blue

splitting of two more B-splines (not shown here for the sake of simplicity). Since two pairs of the new B-splines are identical, the total increase in the number of B-splines, and consequently the surface coefficients, is one.

## *3.1.3 Goodness of Fit of the Approximation*

Following [Sky15], we define the performance indicators:


$$MAE = \frac{1}{n\_{obs}} \sum\_{i=1}^{n\_{obs}} |z\_j - \hat{z}\_j|, \quad \mathbf{z} = \{z\_j\}\_{j=1}^{n\_{obs}} \text{ and } \hat{\mathbf{z}} = \{\hat{z}\_j\}\_{j=1}^{n\_{obs}}$$

.

We have *MAE* <sup>≤</sup> *RMSE* <sup>≤</sup> <sup>√</sup>*nobs* <sup>×</sup> *MAE*.


## **3.2 Refinement Strategies**

During the adaptive surface approximation, we maintain local information on the approximation accuracy for each mesh cell. The cells with insufficient accuracy are easily identified, but it is not obvious that every such cell should trigger refinement at every iteration step in the adaptive algorithm.

# *3.2.1 Isogeometric Analysis Versus Scattered Data Approximation*

Some studies on the properties of various refinement strategies for LR B-splines and other LR splines exist, mostly in the context of Isogeometric Analysis (IgA). Johannessen et al. [Joh13] apply the strategies they named**Full span**,**Minimum span** and **Structured mesh** combined with LR B-splines for IgA. Bracco et al. [Bra18] also focus on IgA and study two classes of meshes for hierarchical B-splines. Hennig et al. [Hen17] compare two refinement strategies for hierarchical B-splines and Tsplines. The refinement of LR B-spline surfaces from the perspective of maintaining local linear independence has been addressed in [Bre15] and [Pat20], respectively.

The use of LR splines for scattered data approximation differs from the use in IgA by the persistence of the surface. In the approximation setting, the computed surface is the final result while in IgA, the surface (or volume) is an intermediate step in the computations, i.e., the surface is not kept for further use. When approximating a point cloud, the variation of the underlying surface represented by the data points can be deduced during the approximation. A tailor-made selection of surface coefficients is possible. In IgA it is only possible to know which degrees of freedom are necessary first when the simulation is completed. An extensive introduction of new degrees of freedom may be appropriate. We will focus on approximation accuracy related to data size as in [Sky22].

## *3.2.2 Refinement Strategies in the Approximation Context*

In the following, we focus on the approximation of geospatial data and present a variety of refinement strategies. The approximation algorithm coincides with the one presented in Sect. 3.1.1 and the iteration is pursued until the prescribed tolerance is met or no further accuracy improvements are possible. Here we give a short overview over the refinement strategies pursued in [Sky22], which are partly the same as in [Joh13]. The methods will be applied to various data sets in Chaps. 5 and 6.

The strategies are:


Restricted mesh (R): A B-spline is selected for refinement, and refined in its support by inserting new knots in a subset of the knot intervals of the B-spline. Knot intervals towards the middle of the support, large intervals compared to the size of the support, and intervals corresponding to cells with a high number of points having a residual value larger than the tolerance are subject to knot insertion.

Given a mesh cell where the accuracy criteria are not met, the approximation flexibility will increase locally if one B-spline containing the cell in its domain is split. A good refinement strategy will insert enough new meshlines to resolve the criteria in a substantial number of cells without starting to adapt to noise. A slow pace in the introduction of meshlines in general gives surfaces with few coefficients, but increases the computational time to some extent. On the other hand, a very restrictive introduction of lines may lead to a blocking meaning that the approximation accuracy is not improved despite an introduction of new mesh lines in areas where the tolerance is not met. The pace of the refinement can be reduced by selecting a restrictive strategy (Mc or R), by refining in one parameter direction at a time (A) as opposed to refining in both directions (B), or by applying some kind of threshold. The latter means that not all candidate mesh cells or B-splines are refined at each iteration level. The candidates are sorted according to selected criteria involving the number of points outside the tolerance belt and the residual value in these points [Sky22]. Only the most significant candidate cells are refined.

To summarize, the full span strategy (F) shows the most stable behaviour for the diversity of test cases investigated. Alternating parameter directions (A) gives fewer coefficients than the full span strategy in general, with an acceptable increase in computational time. Thresholding can be beneficial, but these results were slightly less conclusive. Furthermore bidegree (2, 2) is preferred over bidegrees (1, 1) and (3, 3). Bicubic polynomials generally produces more coefficients without delivering better accuracy. Bilinear polynomials often results in lean surfaces, but the stability suffers for some data sets and some refinement strategies. We will build on the results from [Sky22] and complement them with statistical considerations in Chap. 4. Further examples are given in Chaps. 5 and 6.

Figures 3.2 and 3.3 show the resulting mesh after refinement triggered by one mesh cell or B-spline using different strategies. The initial surface is biquadratic and defined on a uniform mesh with seven inner knots in each parameter direction. Normally, a number of refinements is performed at every iteration step, but here only one cell or B-spline support is selected for illustration purposes.

**Fig. 3.2** Refinement strategies. **a** Initial mesh, selected cell is yellow. **b** Full span in one parameter direction at a time (FA). **c** Full span in both parameter directions (FB). **d** Minimum span in one parameter direction at a time (McA). **e** Minimum span in both parameter directions (McB)

**Fig. 3.3** Refinement strategies. **a** Initial mesh, selected B-spline support is yellow. **b** Structured mesh in one parameter direction at a time (SA). **c** Structured mesh in both parameter directions (SB). **d** Restricted mesh in one parameter direction at a time (RA). **e** Restricted mesh in both parameter directions (RB)

## **3.3 Surface Approximation**

At each step in the iterative surface fitting algorithm, the coefficients of the current surface are computed to obtain the best fit to the point cloud. The approximation is performed either using a least squares (LS) approach or the multilevel B-spline approximation (MBA).

## *3.3.1 Least Squares Approximation*

In this section, we introduce the concept of LS approximation within the framework of surface fitting. We restrict ourselves to the main formulas for the sake of simplicity. More details can be found in the references.

#### **General Formulation**

The LS approximation is a global method. The following expression is minimized with respect to the surface coefficients *cB* over the entire surface domain:

$$\min\_{\mathbf{c}} \left[ \alpha\_1 J(F(\mathbf{x}, \mathbf{y})) + \alpha\_2 \sum\_{h=1}^{n\_{ab}} (F(\mathbf{x}\_h, \mathbf{y}\_h) - z\_h)^2 \right]. \tag{3.1}$$

**x** = (*xh*, *yh* ,*zh*), *h* = 1,..., *nobs* are the data points. The surface is defined as *F*(*x*, *y*) = *<sup>B</sup>*∈*B<sup>j</sup> cBNB*(*x*, *<sup>y</sup>*), where *NB*, *<sup>B</sup>* <sup>∈</sup> *<sup>B</sup><sup>k</sup>* , are the scaled TP B-splines defined as in Sect. 2.5. Note that we index by TP B-splines in the collection *B<sup>k</sup>* . Thus, the LS method minimizes a scaled version of RMSE. The expression is differentiated and turned into a linear, sparse equation system in the number of surface coefficients. The equation system is solved iteratively using a pre-conditioned conjugate gradient method. A pure LS approximation method will result in a singular equation system if there exist B-splines with no data points in their support.

#### **Smoothness Function**

A typical point cloud has a non-rectangular outline and may contain voids. Parts of the surface domain will frequently lie outside the domain of the point cloud. This challenge is addressed by adding a smoothness term *J* (*F*(*x*, *y*)) to the minimization functional. It allows to solve a non-singular equation system even if this situation occurs. Thus the smoothness term approximates the minimization of curvature and variations of curvature in the surface. These intrinsic measures are made parameter dependent to give rise to a linear equation system after differentiation. The smoothness term is expressed as:

$$J(F(\mathbf{x}, \mathbf{y})) = \int \int \int\_0^\pi \sum\_{i=2}^3 w\_i \left( \frac{\partial^i F(\mathbf{x}\_0 + r \cos \phi, \mathbf{y}\_0 + r \sin \phi)}{\partial r^i} \Big|\_{r=0} \right) \mathbf{d}\phi \mathbf{dx}\_0 \mathbf{d}\mathbf{y}\_0. \tag{3.2}$$

At each point(*x*0, *y*0)in the surface domain , a weighted sum of the directional first and second derivatives of the surface is integrated around the circle and the result is integrated over the surface domain. The directional derivative is represented by the polar coordinates φ and *r*. The two terms in *J* (*F*(*x*, *y*)) are given equal weight. The weight on the smoothness term is kept low to emphasize the approximation accuracy. In the examples of the next sections α<sup>1</sup> = 1.0 × 10−<sup>9</sup> and α<sup>2</sup> = 1 − α1. A more detailed description of the procedure can be found in [Meh97]. A suite of alternative smoothness terms is presented in [Now98].

## *3.3.2 Multilevel B-Spline Approximation*

MBA is a local, iterative approximation method, [Lee97]. It was originally developed for a hierarchy of TP B-spline functions. Given a current surface *Fk* (*x*, *y*), the residuals corresponding to the data points are defined and a surface *Gk* (*x*, *y*) approximating these residuals is computed. In the hierarchical B-spline setting, a set of subdomains where a prescribed tolerance is not met is identified and residual surfaces are computed for these domains only, [Zha98]. A special construction is applied to maintain continuity of the final hierarchical surface at the boundaries between these subdomains and the remaining parts of the surface.

#### **Computing the Coefficients of the Residual Surface**

The coefficients *qB* of the residual surface *Gk* (*x*, *y*) = *<sup>B</sup>*∈*<sup>B</sup> qBNB*(*x*, *<sup>y</sup>*) are computed individually for each scaled TP B-spline *NB*. Each coefficient *qB* is determined from the collection of data points *P* = (*x <sup>p</sup>*, *yp*,*z <sup>p</sup>*) ∈ *P<sup>B</sup>* in the support of *NB*. Let *r <sup>p</sup>* = *z <sup>p</sup>* − *Fk* (*x <sup>p</sup>*, *yp*) be the residual corresponding to the point (*x <sup>p</sup>*, *yp*,*z <sup>p</sup>*). The coefficient is computed as

$$q\_B = \frac{\sum\_{P \in \mathcal{P}\_\mathcal{B}} N\_B(\mathbf{x}\_p, \mathbf{y}\_p)^2 \phi\_{B,P}}{\sum\_{P \in \mathcal{P}\_\mathcal{B}} N\_B(\mathbf{x}\_p, \mathbf{y}\_p)^2},\tag{3.3}$$

where φ*<sup>B</sup>*,*<sup>P</sup>* , addressed below, depends both on the residuals of the points in *P<sup>B</sup>* and the collection, *B<sup>T</sup>* , of B-splines with a support overlapping at least one point in *PB*. By default *NB* ∈ *B<sup>T</sup>* . For each of the residuals, we define an under-determined equation system

$$r\_p = \sum\_{C \in \mathcal{B}\_T} \phi\_{C,P} N\_C(\mathbf{x}\_p, \mathbf{y}\_p), \ P \in \mathcal{P}\_B. \tag{3.4}$$

where φ*<sup>C</sup>*,*<sup>P</sup>* are unknowns to be determined. There are many solutions to this equation system. For MBA, the solution is computed as a pseudo-inverse, see [Hsu92], giving

$$\phi\_{C,P} = \frac{N\_C(\mathbf{x}\_p, \mathbf{y}\_p)r\_p}{\sum\_{C' \in \mathcal{B}\_T} N\_{C'}(\mathbf{x}\_p, \mathbf{y}\_p)^2}, \ C \in \mathcal{B}\_T, \ P \in \mathcal{P}\_B. \tag{3.5}$$

This solution minimizes

$$\sum\_{P \in \mathcal{P}\_{\mathcal{B}}} \phi\_{C,P}^2 \tag{3.6}$$

in the LS sense.

Finally we select φ*<sup>B</sup>*,*<sup>P</sup>* , *P* ∈ *P<sup>B</sup>* as the missing piece in Eq. 3.3. The process is explained in more detail in [Zha98].

#### **Updating the LR B-spline Surface**

The current surface and the residual is represented by a collection of LR B-splines following the procedure of Sect. 2.5. Thus the updated surface can be computed as

$$F\_{k+1}(\mathbf{x}, \mathbf{y}) = F\_k(\mathbf{x}, \mathbf{y}) + G\_k(\mathbf{x}, \mathbf{y}) = \sum\_{B \in \mathcal{B}} (c\_B + q\_B) N\_B(\mathbf{x}, \mathbf{y}). \tag{3.7}$$

MBA is an iterative process. Experience indicates that repeated applications of the approximation algorithm without adding more mesh lines improves the approximation accuracy. The improvement is stopped when the accuracy is restricted by the potential in the current collection of B-splines.

In [Sky15], the two approximation algorithms are compared in various examples. In general, the LS algorithm has a better approximation order while the MBAalgorithm is more stable when the spline space is less regular and/or the number of points in each element is low. We will discuss this topic further in Chap. 5.

## *3.3.3 Summary of the Adaptive Refinement*

The LS method provides the best approximation to the point cloud in the L2 norm. Adding the smoothness term with a small weight maintains the good approximation properties. However, some supports of the smallest B-splines can contain none or few points in case of unevenly distributed scattered data points. Moreover, there is a risk of overfitting with LS, leading to ripples in the surface, see [Bra20]. Thus, the MBA method should be preferred after a few iterations with LS. As stated in Eq. 3.6, the pseudo inverse tends to minimize the deviation of the residual surface from zero. The procedure of local refinement combining LS and MBA is illustrated in Fig. 3.4 and can be summarized as follows:


**Fig. 3.4** Combination of MBA and LS for adaptive surface approximation with LR B-splines

surface coefficients and consequently the size of the LS equation system becomes large.

## **3.4 Example of Adaptive Refinement**

In the following section, we will illustrate how adaptive fitting with local refinement using LR B-splines works. To that end, two simulated point clouds are generated.

**Fig. 3.5** Visualization of the mathematical functions. **a** A Gaussian bell with a dam-like jump. **b** Three peaks on a flat ground

## *3.4.1 Generation of Reference Point Clouds*

The reference surfaces correspond to Fig. 3.5a: A smooth geometry and Fig. 3.5b: A geometry with sharp edges. For each generated point cloud, we set (*x*, *y*) ∈ [−1, 1] 2 with *nobs* = 40,000 scattered data points (*xi*, *yi*,*zi*), *i* = 1...*nobs*. The *z*-component is obtained from the proposed mathematical equations:

$$\text{Point cloud (a): } z = \frac{\tanh\left(10\text{y} - 5\text{x}\right)}{4} + \frac{1}{5\text{e}^{(\text{fs}-2.5)^2 + (\text{5y}-2.5)^2}}.\tag{3.8}$$

The point cloud is illustrated in Fig. 3.5a and corresponds to a Gaussian bell as would be a mountain in real life, and a dam-like jump with a smooth transition.

$$\begin{split} \text{Point cloud (b): } z &= \frac{1}{3\text{e}\sqrt{(10x-3)^2 + (10y-3)^2}} + \frac{2}{3\text{e}\sqrt{(10x+3)^2 + (10y+3)^2}} \\ &+ \frac{3}{3\text{e}\sqrt{(10x)^2 + (10y)^2}} \cdot \text{s.} \end{split} \tag{3.9}$$

The surface represents three peaks with different levels of altitudes on a flat ground, see Fig. 3.5b. We conjecture that the large gradient and the edges are challenging to approximate with LR B-splines. We refer to Chap. 1 for a discussion on approximation methods.

**Fig. 3.6** Meshes at different level of approximation for point cloud (a). tolerance 0.007, refinement strategy FA, polynomial bidegree (2,2)

## *3.4.2 Results of Simulations*

To illustrate how adaptive surface fitting with LR B-splines performs, we use the two point clouds (a) and (b). Following Algorithm 1, we set a maximum of 10 iterations and use a tolerance of 0.007. We present the meshes and some corresponding surfaces for point cloud (a) in Figs. 3.6 and 3.7, and in Fig. 3.8 for point cloud (b). Additionally, we compute the MAE, the maximum error, the number of points outside tolerance and the computational time at each iteration, see Tables 3.1 and 3.2 for point cloud (a) and (b), respectively. We focus on the full span refinement strategies with alternating parameter directions (FA), as described in Sect. 3.2 and choose the polynomial bidegree (2, 2) for the splines.

#### **Point cloud (a)**

After 4 iterations, the algorithm switches to the MBA strategy and stops after the 6th iteration. Here the number of points outside tolerance *nout* reaches 0 and the maximum error *Maxerr* is 0.0046. Approximately 1300 coefficients are needed to approximate the 40,000 points in less than 0.21 s. These values highlight the potential of surface fitting with LR B-splines to approximate in a short amount of time a high

**Fig. 3.7** Final fitted surface after 6 iterations for point cloud (a) with tolerance 0.007

**Fig. 3.8** Top: Meshes at different iteration steps for point cloud (b). tolerance 0.007, refinement strategy FA. Bottom: The fitted surface for level 0 and 10


**Table 3.1** Results of fitting point cloud (a)

0 corresponds to first iteration (coarser level). The refinement stops after 6 iterations as *nout* = 0

number of observations. The MAE reaches 0.00041 at the last iteration level and is close to the tolerance after the 3rd iteration (0.0085 vs. 0.007). Here only 473 coefficients have to be estimated, which is 400 times lower than the total number of points. We point out that a lower tolerance would have led to a higher number of coefficients and more iterations. The choice of the tolerance is often left to the user's convenience and will be discussed in Chap. 4 by means of statistical criteria.

Figure 3.7 provides a visualization of the LR B-spline surface at the 6th iteration. This latter is close to the original one. At the 1st refinement level, the mesh is split by series of vertical meshlines. After the 3rd iteration the refinement is performed near the slope and the bell. As the number of iterations increases, the precision of the refinement in this area is getting higher.

#### **Point cloud (b)**

Unlike the approximation of point cloud (a), 4 points are still outside tolerance for fitting point cloud (b). They are located at the highest peak in the middle of the surface. We use the same tolerance of 0.007 and a maximum of 10 iterations. At that point, *Maxerr* reaches 0.017 and the MAE 0.00037. Although the size of the two data sets is the same, the geometry of point cloud (b) is more challenging to approximate with an LR B-spline surface. We further note that MAE saturates at the 7th iterations. Increasing the levels of refinement from the 7th to the 10th does not lead to a significant improvement in accuracy (0.00039 vs. 0.00037) for 92 additional coefficients (800 vs. 892). We note that the *CT* remains under 0.3 sec after 10 iterations. This is a slightly higher *CT* than for point cloud (a) due to the increase of iteration levels. The splitting of each B-spline becomes more costly as the number of B-splines increases. From Fig. 3.8, the refinement leads to a mesh having more meshlines in the domain corresponding to the three peaks, which is expected from a local refinement strategy. The adaptive fitting with local refinement using LR B-splines has a high performance regarding the geometry with sharp edges and curvature changes (Table 3.2).


**Table 3.2** Results of fitting point cloud (b)

0 corresponds to first iteration (coarser level). The maximum number of iterations was set to 10. Tolerance 0.007

## **3.5 Conclusion**

In this chapter, we have introduced the adaptive fitting with LR B-splines and explained four refinement strategies. The latter can be performed either in one or two directions; The choice of the method depends on the point clouds and the applications at hand, e.g., if the computational time or mesh regularity are important criteria. We have described the LS method to perform surface approximation, to which a smoothness term can be added. This approximation strategy is used in the first iterations of the adaptive algorithm, starting from a coarse mesh. In a second step, we described the MBA, which provides an explicit expression of the coefficients, thus avoiding matrix inversion without losing goodness of fit. An example with simulated observations have shown both the simplicity, conciseness and accurateness of the approximation method with local refinement.

## **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Chapter 4 A Statistical Criterion to Judge the Goodness of Fit of LR B-Splines Surface Approximation**

**Abstract** The surface approximation obtained with adaptive strategies using locally refined (LR) B-splines depends on the degrees of freedom of the spline space, the tolerance from which the refinement is performed, the noise level of the scattered observations, the refinement strategy and the bidegree of the spline space. The choice of the best model is a challenging task that can be partially answered with statistical criteria, such as the Akaike Information Criterion (AIC). Here we relax the assumption that the approximation error should be normally distributed and with equal variance and propose the use of the student distribution to compute the AIC. We apply the AIC to decide which tolerance, refinement level, or polynomial bidegree are the most adequate for an optimal fitting. We highlight how the resulting AIC can be combined with more usual criteria to judge the goodness of fit of the surface approximation.

**Keywords** Information criterion · AIC · Surface approximation · t-distribution · Locally Refined B-splines · Local refinement

# **4.1 Introduction**

A surface approximation of a point cloud can be done either globally (non-adaptive methods), or with locally adaptive methods. The adaptive surface fitting with LR B-splines used in this SpringerBrief belongs to the latter category. LR B-splines can be viewed as a generalization of univariate non-uniform B-splines, see [Dok13] for more details. The approximation of point clouds is performed step by step and the mathematical surface at each new iteration step depends on the result of the previous one [Sky22]. Contrary to Non Uniform Rational B-splines (NURBS) surfaces, local refinement is allowed. This method avoids overfitting in domains where no further refinement is necessary. We summarize the principle of adaptive surface fitting as follows:

• The starting point is a Tensor Product (TP) B-spline space. This is used for defining the initial LR mesh and the first collection of TP B-splines.

© The Author(s) 2023 G. Kermarrec et al., *Optimal Surface Fitting of Point Clouds Using Local Refinement*, SpringerBriefs in Earth System Sciences, https://doi.org/10.1007/978-3-031-16954-0\_4


The multilevel approximation (MBA) proposed in [Lee97] is often combined with a least-squares (LS) approximation in the first steps, see Chap. 3 for more details on the procedure.

The output surface depends on different parameters that are often chosen empirically. In this chapter, we will introduce a statistical criterion called the Akaike Information Criterion [Aka73] to judge the goodness of fit of the approximation, in addition to more usual values, such as the number of point outside tolerance or the mean absolute error (MAE). We further propose to investigate how the tolerance can be chosen with respect to the level of noise in the point clouds.

The remainder of the chapter is as follows: In the first section, we will describe the penalized model selection criteria within the context of surface approximation. The student or t-distribution will be introduced to face the challenge of outliers. Dedicated examples will show the potential of AIC as a global indicator for the goodness of fit.

# **4.2 Surface Approximation and Penalized Model Selection Criteria**

To illustrate the challenges of choosing an optimal model in the sense of AIC, we will discuss two approaches: With and without a penalty term regarding the number of coefficients. We start with a data set of size *nobs*, which we approximate with an LR B-spline surface by setting, e.g., the tolerance, the maximum number of iterations, the refinement strategy, and the polynomial bidegree of the spline. We call the result of the fitting *a model* and consider *k* possible models. The vector *c<sup>k</sup>* contains the estimated coefficients and has a length *ncpk* . Each model has its own likelihood *L* (*c<sup>k</sup>* ): This associates a numerical value to the question how "likely" the model is to the observations. It is convenient to work with the log-likelihood function for the model with the estimates *c<sup>k</sup>* , which is defined as*l* (*c<sup>k</sup>* ) = log (*L* (*c<sup>k</sup>* )) . The likelihood is a measure of goodness of fit and has a meaning *only* when it is compared with another likelihood computed for another model.

#### **Approach 1 without penalty term: Case 1**

When performing a surface approximation, one could search for the optimal refinement level, i.e., the iteration step from which the algorithm should be stopped because the optimal model has been found. Here we would call model 1 the approximation at level 1, model 2 at level 2. To each model is associated a likelihood, computed from the parameter vector of estimated coefficients. As the iteration step increases, its length will increase accordingly, but the corresponding likelihood may increase only slightly. Searching for the minimum of the likelihood without penalizing for the number of coefficients could lead to an overfitting and ripples in the approximated surface.

#### **Approach 1 without penalty term: Case 2**

If we make a first approximation of a scattered point cloud with a tolerance of 0.01, we obtain a parameter vector *c*<sup>1</sup> of length *ncp*1; The approximation has a likelihood *L* (*c*1). In parallel, we can compute a second model by changing the tolerance to 0.005. Its likelihood is *L* (*c*2), with *c*<sup>2</sup> of length *ncp*<sup>2</sup> *ncp*1. For both models, we stop the refinement after 5 iterations. Usually *L* (*c*1) = *L* (*c*2) and we could state that *L* (*c*1) < *L* (*c*2). This would lead to the conclusion that the second model is more appropriate to fit the data as its likelihood is higher. This statement is partially true: The number of coefficients for the second model is much higher than for the first one. This difference may be unfavorable (i) from a computational point of view, (ii) if overfitting should be avoided due to the presence of noise in the data, or (iii) if a lean model is preferred for storage or subsequent use. A too high number of coefficients should be avoided as ripples and oscillations may occur in the fitted surface.

The penalized criteria address the drawbacks raised in the first approach. In their simple form they are called the Bayesian Information Criterion (BIC) [Sch78] or the Akaike Information Criterion (AIC) [Aka73]. The two criteria are defined as:

$$BIC\_k = -2l \left( \mathfrak{c}\_k \right) + \log \left( n\_{obs} \right) n\_{cpk} \text{ and } \tag{4.1}$$

$$AIC\_k = -2\left[l\left(\mathbf{c}\_k\right)\right] + 2n\_{cpk},\tag{4.2}$$

respectively. They can be seen as statistical alternatives to more usual heuristic considerations: The first term in Eqs. 4.1 and 4.2 is the log-likelihood, i.e., a measure of the goodness of fit to the data. The second term is a penalty term, which accounts for the increase in complexity. When *k* models are compared with each other, the model with the smallest IC is chosen. Choosing the best model within the framework of IC can be seen as finding a balance between these two quantities. The reader is referred to [Bur02] for the detailed derivation of the IC. In the following, we come back to the two cases with the second approach which accounts for a penalty term.

#### **Approach 2 with penalty term: Case 1**

For case 1, we can assume that the likelihood will saturate after a given number of iterations. At the same time, the number of coefficients will still strongly increase with each iteration step. It is likely that a minimum of the BIC and/or the AIC occurs, balancing both values.

#### **Approach 2 with penalty term: Case 2**

For case 2, only two models are compared with each other, the choice of the most optimal model is easy to meet if both *AIC*<sup>2</sup> > *AIC*<sup>1</sup> and *BIC*<sup>2</sup> > *BIC*1, i.e., it can be concluded on the superiority of model 2 with respect to model 1: A tolerance of 0.005 is more optimal than 0.01 for approximating the data at hand, within the context of model selection with IC.

Potentially the BIC and the AIC may come to two different conclusions, i.e., *AIC*<sup>2</sup> < *AIC*<sup>1</sup> and *BIC*<sup>2</sup> > *BIC*1. For case 1, this could be that the 3rd step is more optimal for BIC and the 5th for AIC. It is often stated that the BIC underestimates the optimal number of parameters to estimate. On the contrary, the assumption beyond the AIC is that the true model is unknown and unknowable. AIC is good for making asymptotically equivalent to cross-validation, and BIC for consistent estimation. In case of disagreement of the two criteria, other measures of goodness of fit should be added within the context of surface fitting, such as the MAE, the maximum error *Maxerrk* , or *noutk* .

In the following, we skip the subscript *k* for the sake of readability. We refer to Chap. 3 and recall that the following indicators to judge the goodness of fit will be used additionally:


These indicators are described in Chap. 3. Here we propose to highlight how they can be used in combination with the AIC to provide a weighted conclusion about the goodness of fit of the surface approximation.

# **4.3 Improving Information Criterion for Surface Approximation**

We consider that AIC is an adequate criterion for model selection in the field of surface approximation as the true underlying surface is unknown. The risk of underestimation of the number of coefficients with the BIC should be avoided as details may not be revealed properly. If the AIC does not have a minimum, deeper investigations could be needed by changing the setup of the surface approximation (bidegree of the splines, refinement strategy, see Chap. 3). From now, we will only consider the AIC and search for its minimum when comparing *k* models.

## *4.3.1 The Challenge of Normality*

The likelihood function is often taken to the Gaussian one, assuming the residuals of the surface approximation to be normally distributed. Unfortunately, this strong belief, when violated, can lead to a biased AIC. This compromises the correct and in-dubious determination of the AIC minimum and the choice of the most adequate model among a set of candidates. We propose to use the t-distribution (also called student's distribution), which gives more probability to observations in the tails of the distribution than the standard normal distribution [McN06]. This allows to give different weights to points outside the tolerance in the surface approximation, such as outliers. The t-distribution is defined by three parameters: μ its mean, σ its variance and ν*<sup>t</sup>* the degree of freedom of the distribution. The normal distribution is a special case of the t-distribution when the degree of freedom approaches infinity. The parameters of the t-distribution θ = [μ, σ, ν*t*] cannot be expressed in a closed form. A stable approach to estimate them is via the iterative two-steps EM (Expectation Maximization) algorithm. In that case, we assume the observations to be independently and identically distributed. The so-called E-step computes the expected value of *l* (**p**) given the observed data whereas the M-step consists of maximizing the expectation computed over the parameters to estimate. In most cases, the algorithm converges to a local maximum [Liu95].

The log-likelihood of the density of **r**, with **r** = *<sup>r</sup>*1, ...,*rnobs<sup>T</sup>* , the residuals of the surface approximation, which are assumed to come from the t-distribution *t* (μ, σ, ν*t*), is given by:

$$\log\left(L\left(r\right)\right) = n\_{obs}\left(\log\Gamma\left(\frac{\upsilon\_t + 1}{2}\right)\right) - n\_{obs}\log\Gamma\left(\frac{\upsilon\_t}{2}\right) - \frac{1}{2}n\_{obs}\log\sigma^2$$

$$+ \frac{1}{2}n\_{obs}\upsilon\_t\log\left(\upsilon\_t\right) - \frac{\upsilon\_t + 1}{2}\sum\_{i=1}^{n\_{obs}}\log\left(\upsilon\_t + \delta\left(r\_i; \mu, \sigma\right)\right) \qquad (4.3)$$

with the Gamma function and <sup>δ</sup> (*ri*;μ, σ) <sup>=</sup> (*ri*−μ) 2 <sup>σ</sup> , the standardized Mahalanobis square distance, refer to [Mah36] for more details.

The proposed AIC depends on the statistical properties of the approximation error of the fitted LR-surface: The size of the observation vector *nobs*, the number of coefficients *ncp*, the refinement strategy, the bidegree of the spline space, and the parameter of the t-distribution.

## *4.3.2 An Improved AIC for Surface Approximation*

In real applications, the true model is unknown. It is easier to assess the potential of statistical criteria such as the AIC within the framework of simulations. We have chosen the two different point clouds presented in Chap. 3 to illustrate how the AIC can be used to judge the goodness of fit, as an alternative to the usual indicators (*ncp*, *Maxerr*, *nout* , or MAE). We will investigate the optimal:


# **4.4 The AIC to Choose the Settings for Surface Approximation of Scattered Data**

The two reference surfaces used in the following correspond to (a): Smooth geometry and (b): Geometry with sharp edges. For each generated point cloud, we simulated *nobs* = 40,000 scattered data points (*xi*, *yi*,*zi*), *i* = 1...*nobs*. Both are illustrated in Fig. 4.1. In the following, all values will be given in m, if not specified differently in the text. The *z*-component of point cloud (a) is given by

$$z = \frac{\tanh\left(10\mathbf{y} - \mathbf{5x}\right)}{4} + \frac{1}{\mathbf{5e}^{(\mathbf{5x} - 2.5)^2 + (\mathbf{5y} - 2.5)^2}}.\tag{4.4}$$

The point cloud (b) is generated by letting:

**Fig. 4.1** Visualization of the generated point clouds. **a** A gaussian bell with a dam-like jump. **b** Three peaks on flat ground

$$z = \frac{1}{3\mathfrak{e}\sqrt{^{(10\mathfrak{x}-3)^2 + (10\mathfrak{y}-3)^2}}} + \frac{2}{3\mathfrak{e}\sqrt{^{(10\mathfrak{x}+3)^2 + (10\mathfrak{y}+3)^2}}} + \frac{3}{3\mathfrak{e}\sqrt{^{(10\mathfrak{x})^2 + (10\mathfrak{y})^2}}}.\tag{4.5}$$

To mimic real data, we add a Gaussian noise of standard deviation 0.002 m in the *z*-direction.

## *4.4.1 Number of Iteration Steps*

Here we investigate the optimal level of refinement with the AIC. The results are presented in Table 4.1 for point cloud (a) and Table 4.2 for point cloud (b).

For point cloud (a), the number of points outside tolerance *nout* was 0 after 15 iterations for a tolerance of 0.007 and a final MAE of 0.0016 m. The AIC has a local minimum at level 7 and a new minimum at level 11. The minimum at level 7 corresponds to the stage where the MAE saturates. This highlights the coherence of the different indicators for the smooth and homogeneous surface under consideration.

For point cloud (b), we found a minimum of the AIC at the 13th iteration level for the tolerance 0.007. Here there was no point outside tolerance after 14 iteration steps and the MAE reaches 0.0016. We note that at the 8th iteration, there is a turning point and both the MAE and the AIC saturate. Increasing the number of level of iterations leads to more coefficients (1650 at the 8th iteration step versus 2292 at the 14th) but not a strong improvement of the fitting. However, the improvement seems significant enough so that the AIC, which balances *ncp* versus the likelihood, has a weak minimum at the 13th iteration step. We link this findings with the challenging geometry of the point cloud with peaks. The results are presented in Table 4.2 together with the other indicators for the sake of comparison.


**Table 4.1** Investigation on the AIC by varying the iteration level for a given tolerance of 0.007 for point cloud (a)

The maximum *Maxerr* and MAE are given in m. FA strategy is used for refinement, bidegree (2,2)

**Table 4.2** Investigation on the AIC by varying the iteration level for a given tolerance of 0.007 for point cloud (b)


The maximum *Maxerr* and MAE are given in m, respectively. FA strategy is used for refinement, bidegree (2,2)

## *4.4.2 Refinement Strategy*

In Chap. 3, we presented a set of refinement strategies that can be implemented with LR B-splines. We will here investigate two of them in the context of optimal surface fitting using AIC to judge the goodness of fit.


The potential number of new coefficients at each iteration level is much less for FA compared to FB. For FA more iterations are expected to reach an acceptable accuracy. However, Skytt et al. [Sky15] show that this reduced pace in the introduction of new coefficients will lead to surfaces with fewer coefficients. Here the two refinement strategies can be considered as two models within the AIC framework as they are not equivalent, i.e., they lead to different residuals and likelihood. In the following, we set the tolerance to 0.007, the bidegree of the spline to (3,3) and the maximum iterations to 20. We compare two FA and FB refinement strategies to highlight the flexibility of the setting.

#### **Point cloud (a)**

We found that FB has a minimum AIC at the 7th iteration step but this latter starts to saturate at the turning point from which *ncp* begins to increase strongly (4th iteration step), see Fig. 4.2. For FA, the *ncp* increases at a slower pace compared to FB. The AIC has a weak minimum at the 15th iteration but saturates from the 6th one, as shown in Fig. 4.2.

The MAE for FA is 0.0016 after 20 iterations and a *CT* of 7.7 s. For FB, after 9 iterations and 3.8 s, the MAE reaches a comparable value of 0.00157. For both strategies, there is no point outside the tolerance at those iteration steps. Thus, FB is more favorable from a *CT* perspective. The computation times include computing the AIC. If AIC is omitted, the times are 1.10 and 0.95 s for FA and FB, respectively. However, the number of coefficients *ncp* is much higher for the FB strategy (7024 vs. 4655 at the optimal iteration step). To compare, 4932 coefficients had to be estimated at the 4th iteration step with the FB strategy, and 173 points were still outside tolerance versus 2 points for the FA and 4655 coefficients.

We further note that the minimum of AIC for FB at the (from the AIC perspective) optimal 7th iteration is higher than for FA (− 416,573 vs. − 419,125 for FA at the optimal 15th iteration). This difference would indicate that FA is more optimal from a statistical criterion perspective than FB. This choice has to be weighted from a practitioner perspective, i.e., answering the question if more accuracy is needed or not, if the *CT* is an important criterion or not, and taking into consideration the challenge of overfitting. There is no definitive answer as the truth does not exist. It is a question of interpretation.

**Fig. 4.2** Results of the approximation for the two refining strategies FA and FB, point cloud (a). **a** AIC. **b** *ncp*

#### **Point cloud (b)**

For the point cloud (b), we find that FB has a minimum AIC at the 7th iteration step. It starts to saturate at the turning point (5th iteration step) from which *ncp* begins to increase strongly, see Fig. 4.3. As the MAE, the AIC for FA and FB reaches a weak minimum, which indicates a fitting that can hardly be considered as optimal. We found that the MAE and the AIC for FA reach slightly lower values than for FB: for the MAE we found 0.0016 versus 0.0015 and for the AIC − 421,830 versus − 411,504 for FA and FB, respectively. The AIC can, thus, allow to conclude on the superiority of the FA strategy for point cloud (b) but the results differ slightly if we only consider the MAE as a criterion to judge the goodness of fit. This highlights the importance of accounting for *ncp* to balance the likelihood. However the difference from a computational point of view for FA to reach the minimum is significantly higher than for FB: 1.437 s for FB versus 4.523 s for FA. The recording of the computational time includes computation of the AIC.

The previous results would tend to indicate that FA is more optimal than FB for fitting point clouds with LR B-spline surfaces. This is partially true and has to be weighted against *CT* . The the two examples clearly highlights that the fitting with the FB strategy produces more coefficients than FA for a similar accuracy while FA has a higher *CT* than FB. Still, this justifies our choice of using FA in the previous (and following) sections without lack of generality. This highlights, also, that new criterion should be found that also would also account for the *CT* to judge and balance the goodness of fit.

**Fig. 4.3** Results of the approximation for the two strategies FA and FB, point cloud (b). **a** AIC. **b** *ncp*

## *4.4.3 Tolerance*

A proper tolerance is important for surface fitting: A large tolerance will make the process faster but may lead to underfitting, a smaller tolerance will increase the accuracy of the fitting result but costs more time, i.e., the fitting surface will be more complex, not to speak of the risk of overfitting. Hence, we can use AIC as the criterion to compare with the usual indicators and weight the number of parameters versus global accuracy. The fitting with minimum AIC is the optimal tolerance in a global sense.

In this section, we show the potential of AIC for investigating the tolerance. Here the standard deviation of the noise is taken as previously to be 0.002 m in the *z* direction. We vary the tolerance within a range from 0.005 to 0.011. We use refinement strategy FA and polynomial bidegree (2,2) and focus on point cloud (a). Similar conclusions could be drawn for point cloud (b) and are not presented here. Table 4.3 gives the AIC, as well as the iteration level with no point outside tolerance. For each tolerance, we set the number of maximum iteration steps to 20. For example, when the tolerance is 0.01, the approximation will continue until 14th iteration step, but the minimum AIC is reached at the 7th step. The AIC decreases with the tolerance and has a minimum for a tolerance of 0.007, which is illustrated in Fig. 4.4. This value was the optimal tolerance chosen for Table 4.1. We further note that the MAE stays around 0.0017 for all tolerances at the optimal number of iterations, and has a weak minimum at a tolerance of 0.006. This result is compatible with the results given by the AIC.


**Table 4.3** Investigation on the AIC by varying the tolerance

## *4.4.4 Polynomial Bidegree of the Splines*

In this section, we vary the bidegree of the splines from (2, 2) (biquadratic) to (3, 3) (bicubic), which are usual choices for performing surface fitting. This corresponds to two different models within a model selection framework. We consider point cloud (a) and (b) and use the FA strategy for refinement, as well as a tolerance of 0.007. For point cloud (a) and for the optimal refinement level, we found that the biquadratic setting leads to a minimum of the AIC compared to the bicubic one (− 419,130 vs. − 419,125). From the MAE perspective, we found a value of 0.0016 for both settings at the optimal iteration step for the AIC (11th for the biquadratic and 15th for the bicubic respectively). The MAE does not decrease significantly for higher iteration steps, and, thus, does not allow to conclude in favour of a biquadratic or bicubic surface. Furthermore, a low MAE can be risky, i.e., linked with an overfitting. Here the AIC with its minimum, even if weak, has an evident advantage over the MAE to find an optimal iteration level, by weighting the likelihood with the number of coefficients.

We have the same conclusion for point cloud (b). Here the minimum of the AIC is smaller for the bidegree (2, 2) (− 422,672 vs. − 421,830) but the MAE is similar for both optimal iteration steps corresponding to the minimum of the AIC (17 for the bicubic and 14 for the biquadratic).

Skytt et al. [Sky15] mentioned that in most cases a biquadratic surface will suffice, which is in accordance with our results. Thus, in most cases a higher bidegree of the polynomial doesn't contribute to a better accuracy of fitting LR B-spline surfaces for this type of data sets and noise levels.

## *4.4.5 Optimal Tolerance Versus Noise Level*

Depending on the sensors and the conditions under which they are used, the noise level will vary. For a terrestrial laser scanner, the noise level of the range is known to depend on the intensity, i.e., the power of the backscattered laser signal recorded by the instrument after reflection. Atmospheric effects may also act as correlating the observations, i.e., decreasing the effective number of observations [Ker20]. The noise is often characterized by its standard deviation, a quantity which can be provided by the manufacturers. We can conjecture that a high noise level leads to a point cloud that is more challenging to fit optimally, with a strong risk of overfitting. Here we understand under overfitting "fitting the noise" instead of the true underlying surface. This effect is unwanted as it can give surfaces with ripples and oscillations [Bra20]. A wise choice of the tolerance can avoid or strongly mitigate the risk of overfitting. Thus the tolerance is an important parameter which is usually fixed rather empirically. Often, a low MAE is searched. Unfortunately, an artificially small error is not automatically linked with a high accuracy for fitting the underlying point cloud: In case of noise or outliers in the observations, even the contrary may happen.

We propose to investigate the choice of an optimal tolerance in the context of model selection, searching for a minimum of the AIC. To that end, we simulated different Gaussian noise vectors added to the reference point cloud. Their standard deviation was varied in a range of values between 0.001 m (low level of noise) and 0.0045 m. The noised surfaces were fitted with an LR B-spline surface. We chose the FA strategy and a biquadratic surface, following the results of the previous sections.

Here we vary the tolerance for a given noise level and search for the minimum AIC. Each AIC is computed at the optimal iteration step. We place ourselves in the framework of Monte Carlo simulations by simulating each time 100 noise vectors and taking the mean over all indicators.

The results of the investigations for point cloud (a) are presented in Fig. 4.5.

Figure 4.5 highlights that the optimal tolerance found with the AIC depends on the standard deviation of the noise level. As the noise level increases, the optimal tolerance increases, and so the AIC. We found a linear dependency of the optimal tolerance with respect to the noise level with a slope of 3 (left axis in Fig. 4.5a). This

**Fig. 4.5** Performance indicator versus noise level. **a** Optimal tolerance (left axis) and optimal AIC (right axis) versus noise level (std in m). **b** MAE (m) versus noise level

slope is slightly lower (close to 2) as the noise level increases. A similar result was found for point cloud (b) and is not presented here. The slope of 3 can be justified as corresponding to 3 times the standard deviation of the noise, i.e., this is the interval in which 68% of the measurements will fall assuming their normal distribution. We found that the number of optimal iteration steps stays between 6 and 7 and decreases as the noise level increases. This is an important finding as it is unnecessary -if not risky- to continue the adaptive refinement for noisy point clouds. This is what the AIC tells us. We further computed the MAE at the iteration step considered as optimal from the AIC, see Fig. 4.5b. We found a linear dependency, with a slope of 0.78. This latter is less predictable than the previous one regarding the noise level and will depend on the point cloud under consideration.

Following these results, we propose to choose the optimal tolerance as being 2.5 times the noise level. This is a good compromise when the noise of the sensor is unknown. Three times the noise level would be even more conservative and has to be weighted against a potential loss of accuracy.

## **4.5 Conclusion**

In this chapter, we have introduced a statistical criterion as a new tool to judge the goodness of fit of the surface approximation. An information criterion is a weighted measure between the quality of fitting and the number of coefficients that are to be estimated. We showed how the AIC can come into play for determining the optimal level of refinement, the optimal bidegree of the spline, or the choice of the refinement strategy. Exemplary, we found that a biquadratic surface is optimal for a smooth point cloud. The tolerance is often fixed empirically with the aim to have a low RMSE or MAE. We found by investigating the AIC, that the optimal tolerance depends linearly on the noise level of the point cloud and can be fixed to 2.5 times the standard deviation of the noise of the observations. This information is often given by the manufacturers or can be guessed based on the residuals of the approximation and/or previous investigations. Thus, we have provided an answer to the question of the optimal tolerance with respect to the data at hand. These results will be used in Chaps. 5 and 6.

The use of the AIC to judge the goodness of fit is beneficial when many coefficients are needed to fit a point cloud: It avoids unnecessary steps and a possible overfitting. The AIC remains a global statistical quantity which has to be combined with other indicators, depending on what "optimality" should be for the application under consideration. For point clouds with high variability and local changes, the AIC only gives a global indication about the fitting.

## **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Chapter 5 LR B-Splines for Representation of Terrain and Seabed: Data Fusion, Outliers, and Voids**

**Abstract** Performing surface approximation of geospatial point clouds with locally refined (LR) B-splines comes with several challenges: (i) Point clouds have varying data density, (ii) outliers should be eliminated without deleting features, (iii) voids, also called holes, or data gaps should be treated specifically to avoid the drop of the approximated surface in domains without points. These factors tend to be even more challenging when point clouds acquired from different sensors having different noise characteristics are fused together. The data set becomes non-uniform and the fusing process itself involves a risk of an increased noise level. In this chapter, we provide some tools to answer those specific challenges. We will use terrain and seabed data and show didactically how to perform adaptive surface approximation with local refinement and to select customized parameters. We will further address the problem of choosing an appropriate tolerance for performing an adaptive fitting, and discuss the refinement strategies within the context of LR B-splines. The latter is shown to provide a promising framework for surface fitting of heterogeneous point clouds from various sources.

**Keywords** Adaptive refinement · Surface fitting · Outliers · Voids · Trimming · Tolerance · Bathymetry · Data fusion · LR B-splines

# **5.1 Introduction**

In this chapter, we will discuss a number of challenges that occur when working with geospatial point clouds, e.g., outliers, data fusion or domains without points. We will illustrate the discussion with a terrestrial LIDAR point cloud, or a seabed sonar point cloud and the combination of both. We aim to represent the area covered by both point clouds with a smooth parametric surface. The focus is on LR B-spline surfaces, which were shown to be appropriate within this context, see Skytt and Dokken [Sky22] for prominent examples.

The data acquisition of terrains and seabeds produces huge point clouds. The structure—or lack of structure—in the point clouds depends on the technology used to acquire them, i.e., on the sensor under consideration, may it be a terrestrial laser

© The Author(s) 2023

G. Kermarrec et al., *Optimal Surface Fitting of Point Clouds Using Local Refinement*, SpringerBriefs in Earth System Sciences, https://doi.org/10.1007/978-3-031-16954-0\_5

scanner, a single or multibeam sonar. An efficient downstream use of the acquired data requires structured and compact data representations. Locally refined (LR) Bspline surfaces are smooth and flexible surfaces: They provide a middle road between the rigid but effective regularity of raster surfaces, and the highly flexible triangulated surfaces, see Chap. 1 for a detailed comparison. The LR B-spline surfaces are found to be convenient for representation of terrains and seabeds [Sky15, Sky16], as they accurately represent the smooth part of the data while having the flexibility to adapt to local shape variations without globally increasing the data size of the mathematical surfaces.

An LR B-spline surface belongs to the class of locally refined spline surfaces. It is a piecewise polynomial surface defined on a rectangular domain composed of axes parallel rectangular boxes (a mesh). In contrast to a tensor product (TP) spline surface, the boxes do not need to form a regular pattern. The concept of LR B-splines is described in detail in Chap. 2, which also includes an overview of alternative Bspline based locally refined surface methods.

The starting point for defining an LR B-spline surface is a TP B-spline surface.The adaptive refinement procedure inserts new meshlines into the surface description where the surface does not meet a prescribed accuracy requirement. The meshlines must satisfy the rule:

A new meshline must split the support of at least one TP B-spline implying that the refinement increases the number of TP B-splines with at least one.

Refinement is performed in an iterative algorithm described in Chap. 3. First, the accuracy (L1 norm) of a current surface with respect to a given point cloud is computed. In a second step, the surface is refined where the accuracy does not meet the requirements. This process is repeated until the accuracy is found sufficient, which means that it does not exceed a predefined tolerance. Alternatively, the algorithm is stopped by some other constraint, normally the number of iterations. Figure 5.1 summarizes the surface approximation algorithm.

In the following, we propose to illustrate the fitting of a data set combining observations from terrains and seabeds. We will highlight how to solve different challenges that arise in real cases, such as


We will describe in detail the methods used to guide the user through methodological answers to real problems. More precisely, in a first section, we will present the data set that will be approximated. Next, outlier detection procedures will be compared. We will explain the concept of bounding the coefficients. Next the advantages of the Multilevel B-spline Approximation (MBA) will be highlighted. We will conclude by explaining the concept of trimming to deal with data voids. An appendix is dedicated to the output format of the LR B-spline surfaces.

**Fig. 5.1** Approximation of a point cloud by an LR B-spline surface with adaptive local surface refinement

**Fig. 5.2** The island Fjøløy in Norway

## **5.2 Description of the Data Set**

Fjøløy is a 2.1 km2 island in the municipality Stavanger at the south west coast of Norway, see Fig. 5.2. In this area terrestrial and bathymetry data are both available and represented in the same coordinate system. We select a set of corresponding land and sea data from the Fjøløy area, see Fig. 5.3. The terrestrial data set was obtained in 2016 with LIDAR. The bathymetry data was acquired in 2013 by a multibeam

**Fig. 5.3** Selected data sets from land (brown) and seabed (grey). **a** Terrestrial points, **b** bathymetry points, and **c** both

sonar and released for public use in the context of the project "Marine grunnkart pilot (Marine base map pilot)". The boat with the sonar has no access at very shallow water: There is a zone between land and sea where data is lacking (voids). In an ongoing project, data from this zone are acquired with LIDAR bathymetry.

The terrestrial data set shown in Fig. 5.3a, consists of 2,579,974 points containing both land and the sea surface. Several buildings, trees and stones can be found in the covered area. The data set has been classified prior to reception. 73 points were identified as outliers, 1,643,865 as ground points and 936,036 points are unclassified. The outliers are quite distinct. We consider this set as the reference to which we compare our outlier detection methods. The bathymetry data set, Fig. 5.3b, contains 25,107,199 unclassified points. Here no obvious outliers have been identified. The point cloud covers an area of 800 × 600 m, and the height range of the terrestrial data is [− 64.75, 156.32] m, including outliers. After the removal of outliers, the range is [−0.74, 76.4] m. The height range of the bathymetry data is [−21.13, −1.2] m. The total area covered by the combination of land and sea point clouds is just below 0.5 km2.

We want to compute one LR B-spline surface combining both land and seabed as illustrated in Fig. 5.3c. To that aim:


## **5.3 Outlier Detection**

An outlier is defined as an observation that lies at an abnormal distance from other values in a random sample from a population. Different strategies exist to eliminate them.

## *5.3.1 Strategies for Outlier Detection*

Outliers in a data set can come from contaminated data samples, incorrect sampling methods, errors coming from the sensor during data collection or analysis [Haw80, Cha17]. Outliers in large geospatial data set can largely influence the results of the surface fitting, i.e., they will be adjusted as normal observations. If not excluded prior to the approximation, ripples or non-smooth surfaces are likely to arise.

The outliers are categorized as sparse outliers, or can come in clusters [Wan15], in which case they are more challenging to filter. Visual approaches are not suitable for large point clouds and automatic detection should be preferred. Some of the most popular methods for outlier detection for light detection and ranging (LIDAR) point clouds are reviewed in, e.g., Sotoodeh [Sot06]:


# *5.3.2 Comparison of Outlier Detection Methods for the Selected Data Set*

The terrestrial data set presented in Sect. 5.2 contains both outliers and unclassified points representing houses, trees, low vegetation and similar objects. Our aim is to identify outliers, but it is also interesting to see the extent to which outlier detection methods separate ground truth from vegetation and man-made objects. We will investigate the IQR and Z-score methods, as well as a method for detecting single outliers. The outlier detection is applied in the context of adaptive approximation of height data, see Chap. 3. The selected methods do not require any apriori estimate of the number of outliers. Moreover, they can be easily applied to a high number of data points.

The three methods are integrated in the adaptive approximation algorithm and applied at each iteration step in a regression setting. The distances between the points and the current approximating surface are compared making it a regression based method. The outlier detection is applied to subgroups of the data set identified by selecting the points situated in one mesh cell. The group testing is intended to reduce the computational effort in outlier detection. The point groups are subject for testing only if:


The accuracy results from the last iteration step are obtained after outliers removal. At the start of the computation, it is hard to distinguish between features and outliers. As the surface is adapted to the point cloud, the distance in features will be smaller than for outliers; the criterion for allowing an outlier test is gradually decreased at each iteration step.

#### **5.3.2.1 The IQR Test**

Here the residuals correspond to a subset of the point cloud and the current surface are sorted according to their values. We call *Q*1 the first quartile of the residuals and *Q*3, the third one. Then *IQR* = *Q*3 − *Q*1 is the interquartile range of the residuals. We further denote two fences *f* 1 = *Q*1 − *f actor* × *IQR* and *f* 2 = *Q*3 + *f actor* × *IQR*. An outlier is defined as a point with a residual value outside the range of these fences. Often *f actor* = 1.5, justified by assuming that the residuals follow the normal distribution. This factor gives fences at μ − 3σ and μ + 3σ, where μ is the mean and σ the standard deviation. This way 0.28% of the points are expected to be defined as outliers. Unfortunately there is no reason to believe that the residuals are normally distributed. A student distribution with a heavier tail is more probable,


**Table 5.1** Outlier detection with IQR, various factors

For each iteration step the maximum *Maxerr* and average distance (MAE) is reported along with the total number of mesh cells *ncell* , the number of cells tested for outliers *ntest* and the total number of identified (found) outliers found*tot* . At each level, 2 example cells are selected reporting the number of points, fences and the number of identified outliers in the cell

see Chap. 4. As an assumption of the student distribution implies a more demanding computation to find the correct factor, we apply also the factors of 3 and 5 to our outlier detection and study the effect of this factor on the selected data set.

Table 5.1 shows some results for outlier detection with the IQR method. The total number of outliers identified is 29,129, 8,389 and 2,615 for an IQR factor of 1.5, 3 and 5, respectively. All obvious outliers are caught together with a certain amount of vegetation and house points, depending on the factor under consideration. The example cells at the first iteration step are the same for all factors, and we see that the number of outliers found are reduced with increasing factor. Similar distances between the subset of the point clouds and the surface lead to a very diverse number of outliers. Fortunately, this is not necessarily a problem: Large distances can also be synonymous with a low accuracy due to lack of freedom in the surface. This occurs typically at the beginning of the adaptive process when the steepness and roughness of the terrain varies in the selected area.

Figure 5.4 shows the result of the outlier classification. We see that much of the vegetation, buildings and some points at the sea surface are also classified as outliers for factor = 1.5, in addition to the obvious outliers (which are always found). In areas where the majority of the points belong to trees, see Fig. 5.4b, the position of the surface is influenced by the vegetation points as well as the ground points.

**Fig. 5.4** The results of outlier detection with IQR, blue points are classified as outliers while the remaining points are light blue. **a** IQR factor 1.5, **b** detail with factor 1.5, **c** factor 3, **d** factor 5

Ground points and vegetation points become equally likely to be classified as outliers. When the IQR factor is increased, the part of the vegetation classified as outliers is decreased, but not eradicated, as illustrated in Fig. 5.4c and d.

The IQR outlier detection method removes many points and badly assumes that the data is normally distributed. Classification of points from vegetation and buildings should be done with more accurate methods. However, this method is simple and can give useful results if applied with care.

#### **5.3.2.2 Z-score**

Similarly to the IQR algorithm, the Z-score method is based on the assumption that the data are normally distributed. *Zi* <sup>=</sup> *ri*−<sup>μ</sup> <sup>σ</sup> where *ri* is residual number *i*, μ is the residual mean and σ the standard deviation. If the size of the residual is outside the range [−3, 3], it is considered as an outlier. In theory, this should give the same result as the IQR test with factor 1.5 for a normally distributed data set. In this method the mean and standard deviation are computed explicitly; this puts less assumption on the distribution. Table 5.2 shows how outliers are detected with the Z-score method


**Table 5.2** Outlier detection with the Z-score method

The global maximum and average distance between the surface and the points are reported along with the number of cells in the surface mesh and the number of cells tested for outliers. The standard deviation range applies to the tested cells and the number of outliers found the current iteration is reported in the last column

**Fig. 5.5** Results of outlier detection with the Z-score method. Blue points are classified as outliers. **a** The result after one iteration, **b** the result after three iterations

during the adaptive surface approximation. The method classifies fewer points as outliers than the IQR method. The total number detected was 5487. In the first iteration step, only 1 or 2 points are found to be outliers in 27 of the cells tested. A point with a very large residual shadows for other points that may also be considered to be an outlier. These points can be found only when the most extreme cases are removed.

Figure 5.5 highlights that the evident outliers are detected in the first iteration step together with a few points corresponding to vegetation or buildings. In later iterations, more points close to the ground are added. Some points belonging to trees, bushes and houses are classified as outliers. Unfortunately, some points from a tree may be found to be outliers and some may not.

#### **5.3.2.3 Detection Aimed at Single Outlier Points**

The last outlier detection method to be investigated in this chapter is designed to fit within the context of adaptive surface approximation with local refinement such as, e.g., LR B-splines. It mainly aims at identifying single outlier points and has no direct link to the aforementioned statistical methods.


**Table 5.3** Outlier detection aimed at single outliers

The global maximum and average distance between the point cloud and the surface are given along with the total number of cells, the number of cells where outlier detection is applied, the threshold for outlier detection and the found number of outliers

The points in a cell with a residual larger than the threshold are called candidate outlier points. The threshold is used in a pre-processing step to check the cells for possible outliers.

Each candidate outlier is compared to a group of nearby points not restricted by the cell boundaries. The number of points in this group varies, but should be close to 100. A set of characteristics is computed for the group of nearby points, both including and excluding the candidate outliers, to decide if they should be excluded:


For a candidate point to be classified as an outlier, the following rules must apply: *n*w*ith* − *n*w*ithout n*w*ith*, *std*w*ith std*w*ithout* , *MAE*w*ith MAE*w*ithout* and *R*w*ith R*w*ithout* . Furthermore, let *zo* be the elevation of the candidate outlier point and *z <sup>p</sup>* of the closest neighbouring points and *ro*, and let *r <sup>p</sup>* be the residual sizes for the two points. Then |*zo* − *z <sup>p</sup>*| > 2 × *tol* and |*ro* − *r <sup>p</sup>*| > 2 × *tol* where *tol* is the approximation tolerance. Moreover, a steep slope between the candidate outlier and the neighbouring point is required. The combination of these criteria implies that groups of outliers will be detected only if the group contains few points and/or is very deviant from other points in the neighbourhood.

Table 5.3 shows the number of points identified as outliers along with some additional information. We note that the number of outliers is much lower compared to the previous methods. After the most prominent outliers have been removed in the first step, the outlier threshold is reduced significantly. In the first step, the number of candidate outliers in the cell is one or two, and all candidates are classified as outliers. In the second and third step, the number of candidates in a cell varies from 1 to 101 and in most cases no outliers are detected. Groups of points belonging to houses, trees and other vegetation are tested and found not to be obvious outliers.

Figure 5.6 shows the location of the identified outliers. Mostly, the obvious cases are detected although a few points related to vegetation are included. As mentioned in the introduction, the data set contains 73 classified outliers, which were identified in a preprocessing step. The algorithm found 55 outliers where 49 also belong to the group of classified outliers. The current method is best adapted to the problem

**Fig. 5.6** Results of outlier detection with the method aimed at single outliers. Blue points are classified as outliers. **a** The final result, **b** a detail

at hand (elimination of trees and vegetation), but it is complex and has a limited theoretical background. Nevertheless, it seems that a tailor made outlier detection is beneficial when combined with the adaptive method for surface generation.

## **5.4 Surface Approximation of the Selected Data Set**

We use the points classified as ground from the terrestrial data set and remove the data points at the sea surface. This means that only points with a positive height component are included. The ground data is combined with the corresponding seabed data set resulting in the point cloud shown in Fig. 5.7. We notice that there are some shallow water areas where points are missing. Furthermore, the point cloud density is considerably higher for the seabed part compared to the terrain one: There are about 25 times more bathymetry points than terrestrial.

**Fig. 5.7** Combination of terrain and sea data

## *5.4.1 Selection of Methods and Parameters*

Figure 5.1 gives an overview of the surface approximation algorithm. The process starts from a TP B-spline surface, which is adaptively refined in areas where the distance between the surface and the point cloud is larger than a given tolerance. Given a current LR B-spline surface, we can perform the actual approximation with a leastsquare (LS) approach or multilevel B-spline approximation (MBA), see Sect. 3.3. LS approximation is a global approach with some best fit properties while MBA is an iterative explicit local approximation method. The method is to some extent expected to smooth out extreme behaviour in the approximating surface. We normally apply LS approximation for a number of iterations in the adaptive algorithm before turning to MBA.

Data sets are subject to noise and may contain outliers. It is, thus, not obvious that the approximation should be pursued until all points have a distance to the surface smaller than a given tolerance. Normally, the process is stopped by a maximum number of iteration steps, but the finding the optimal number of iterations is challenging. Computing the minimum of AIC is an alternative to find this optimum, but the process is time consuming. Moreover, it is a global method that does not take local variations in the point cloud into account, i.e., a minimum does not always exist. A tolerance is applied to identify where the surface needs to be refined. This value should be defined depending on the measurement accuracy, information that is not always known. Also, the actual selection of new meshlines to insert influences the accuracy and number of coefficients in the final surface. Various refinement strategies are discussed in [Sky22] and a short resume is given in Sect. 3.2. In the remainder of this section, we will discuss the selection of methods (MBA, LS, combination of both, refinement strategy) and parameters (tolerance, number of iterations) for the selected data set.

#### **5.4.1.1 LS Approximation Versus MBA**

Figures 5.8 and 5.9 compare (i) the approximation with LS until about 33,000 coefficients are estimated, and switch to MBA, and (ii) MBA for the entire computation using a tolerance of 0.5 m. We see that for the LS approximation, both the number of unresolved points and the average distance in these points are lower than MBA for the same number of coefficients. The difference is the largest for few coefficients and diminishes when the number of coefficients increases.

Figure 5.10 shows the approximating surfaces using LS and MBA. The difference is small, but looking at Fig. 5.11, it is clear that MBA offers a smoother transition in areas with no point. LS approximation should be applied early in the approximation process for data sets with relatively uniform density, a low noise level and no outliers. For non-smooth data sets with voids, MBA should be the preferred choice.

When fitting point clouds with spline surfaces, an overshoot in areas with steep gradients may arise, in particular with unevenly distributed data points. On the other hand, the surface is bounded by its coefficients due to the property partition of unity.

**Fig. 5.9** Accumulated distance in points with distance more than 0.5 m scaled with a factor of 1/10,000

**Fig. 5.10** Results of surface generation, LS = green, MBA = brown. **a** The surfaces are roughly similar, **b** the point set is included in the figure to emphasize the areas without points

**Fig. 5.11** Focus on areas without points. **a** Approximation with LS, **b** using MBA

By limiting the surface coefficients to a range slightly larger that the height range of the data set, extreme overshoots can be avoided. Figure 5.12 focuses on a subset of the point cloud covering a part of the area depicted in Fig. 5.11. The subset contains 1,494,242 points, which are shown in Fig. 5.12a. The area is rough, includes parts without points and has steep climbs from the seabed to two islands. The adaptive approximation procedure starts from a biquadratic surface without inner knots and is allowed to continue for 12 iterations with refinement in alternating parameter directions. Using MBA, the result is almost similar when the size of the coefficients are bounded or not, see Fig. 5.12b and c. This is not the case for the LS approximation. When the coefficients are bounded to a range slightly larger than the elevation range of the data point, the resulting surface depicted in Fig. 5.12d is quite well behaved in the areas without point although less smooth than the MBA surfaces. Without a bound on the coefficients, the surface oscillates drastically in areas without points (Fig. 5.12e).

The LS approximation is combined with a smoothing term to ensure a solution in areas without points, see Chap. 3 for more details. The weights on the approximation term and the smoothing term sum up to one. Normally, the weight on this term is kept low to emphasize approximation. We apply a higher weight (0.1) to study the effect of smoothing in challenging configurations as shown in Fig. 5.12f. The extreme behaviour in Fig. 5.12e is avoided, but the surface is generally less smooth than the alternatives shown in Fig. 5.12b, c and d. The approximation accuracy is lower when a high weight on the smoothing term is applied. It must be noted that the approximation errors increase in the last iteration step in theses cases. Otherwise, the accuracy does not differ much between the various approaches, see Table 5.4. LS approximation may become less accurate when the LR mesh gets very unstructured. Then the algorithm switches to perform approximation with MBA. We stop the iteration just before this situation occurs so the results are achieved with LS

**Fig. 5.12** Focus on areas without points. Approximation with different selections of approximation method. **a** Data points, **b** approximation with MBA and bound on the coefficients, the surface is light blue and the points can be glimpsed in clear blue, **c** approximation with MBA and no coefficient bounds, **d** LS approximation with coefficient bounds, **e** LS approximation, no coefficient bounds, **f** LS approximation with high weight on the smoothing term (0.1) and no coefficient bounds


**Table 5.4** Accuracy of the subset of the point cloud with LS approximation and MBA

The weight on the smoothing term for LS is 1.0e−<sup>9</sup> and 0.1. The tolerance is 0.5 m *<sup>a</sup>*Smoothing term has weight 0.1

approximation or MBA, purely. We note that the bounds on the surface coefficients do not hamper the approximation accuracy.

#### **5.4.1.2 When to Stop the Iteration**

Figures 5.8 and 5.9 indicate that the gain in continuing the approximation after the surface having 20,000–30,000 coefficients is small. For approximation with MBA, the maximum distance decreases from 3.782 to 3.426 m, the average distance from 0.100 to 0.073 m and the fraction of points outside the tolerance from 1.9 to 0.57% when the number of coefficients increases from 21,572 to 76,110 and the computation time from 3 min. 24 s to 4 min 28 s.We refer to Table 5.5 for the accuracy development for an increasing number of iterations.

When searching for an optimal surface approximation, a balance has to be found between the number of iterations, the MAE and other performance indicators, i.e., the maximum distance and the computational time for a given tolerance. The choice is let to the practitioner: This latter should judge the risk of fitting the noise as the number of iterations increases. Here an indication can be provided by searching the minimum of AIC, see Chap. 4. In our particular case, no minimum could be found: We link the lack of minimum with the fact that the surface contains many details and is not smooth enough, i.e., a global criterion on its own is not sufficient to judge the goodness of fit.

#### **5.4.1.3 Tolerance and Accuracy**

A main concern regarding surface fitting is linked with the accuracy of the approximation. This is especially important in areas like seabed shallows, while the noise level may be high at shallows due to sea vegetation and a narrow sonar width resulting in multiple traversals by the boat carrying the sonar. The surface should accurately represent the main shape of the terrain, but not necessarily adapt to every little stone. The tolerance is used to determine where the surface needs refinement and consequently the achievable accuracy. It is a predetermined value that should reflect the precision of the measurement. A level of 2–3 times the measurement error can be considered appropriate as discussed in Chap. 4. This is a first indication as the real error is normally larger than the precision of the measurement device, which is not always known. Several scans are merged and arbitrary objects, like power lines and fishes, may influence the result. Here we investigate the impact of the threshold on the fitting.

The surface approximations in Figs. 5.8, 5.9, 5.10 and 5.11 were performed with a tolerance of 0.5 m. The algorithm was allowed to run for 12 iterations, and all mesh cells where the maximum distance between the surface and a point in that cell exceeded the tolerance triggered refinement. All B-splines with the cell in its support were refined in one parameter direction at the time, in the x-direction at odd levels and the y-direction at even levels. This corresponds to the refinement strategy called FA, see Chap. 3 for more details. The MAE dropped below the tolerance at iteration level 2 for both LS approximation and MBA, and touched 0.1 m at level 9. The tolerance of 0.5 m is selected somewhat arbitrary, but is found to balance surface size and accuracy.


**Table 5.5** Tolerance, number of iteration steps, MAE, number of coefficients and percentages of points with a distance to the surface in specified ranges

The applied refinement strategy is FA and surface approximation with MBA is applied. Distances are given in m, and |*d*| denotes the absolute value of the distance between a point and the surface

Table 5.5 presents some accuracy results for a selection of tolerances and maximum iteration levels. The setup used in Figs. 5.8, 5.9, 5.10 and 5.11 is highlighted with bold font. The difference in accuracy between the applied tolerances is remarkably small while the numbers of coefficients differ greatly when a high number of iterations is applied. In the first iteration steps, the selected tolerance plays a limited role. The approximation error indicates similar refinements for all applied tolerances.

Figure 5.13 shows that the configuration of points with a residual value smaller than or larger than 0.4 m is relatively similar for the tolerances 0.1 and 0.6 m. Some differences can be spotted mainly due to an increase in point size for the points with a

**Fig. 5.13** Point cloud coloured according to the distance to the surface. White points are closer than 0.4 m, green points lie below the surface and red points above. More saturated colour means larger distance. The size of the white points are reduced compared to the coloured points. **a** Tolerance 0.1 m, 12 iterations, **b** tolerance 0.6 m, 12 iterations

distance larger than 0.4 m in the picture. The surface adapting to a tolerance of 0.1 m has more points within this tolerance belt than the other surfaces, but the difference is negligible compared to the difference in the number of surface coefficients. The percentages of points within this small belt after 12 iterations is 79.3, 78.4, 77.4 and 75.8% for tolerances of 0.1, 0.4, 0.5 and 0.6, respectively. The roughness of the data does not allow such a tight approximation with a smooth surface. The majority of the points with a high distance to the surface belong to the seabed. This can be caused by the number of bathymetry points being much higher than terrestrial points, but also from the bathymetry points being unclassified whereas terrestrial points are classified as ground. The descent is most prominent in shallow seabed areas.

#### **5.4.1.4 Refinement Strategies**

In Sect. 5.4.1.3, we saw that a tighter tolerance increased the number of surface coefficients considerably at later iteration levels without improving the accuracy significantly. The effect of the extra refinement is low. Similar results were also found in [Sky22]. A rapid introduction of new meshlines leads to more coefficients for similar accuracy, but also a lower computational time. A slower pace in introducing new degrees of freedom often led to few coefficients and an acceptable computation time, while a very restrictive introduction could block further accuracy improvements and eventually lead to more surface coefficients that contribute little to an accurate approximation.

Table 5.6 illustrates how different refinement strategies for defining new meshlines influence the approximation results.We stop the iteration after the surface has reached 20,000 coefficients. The number of iterations required is reported in column two. For the strategies whose name starts with F and Mc, the refinement is triggered by mesh cells that contain points with a residual value larger than the tolerance. For strategies starting with S and R, refinements are triggered for B-splines having such points in their support. If the strategy is marked by "all", all such occurrences will lead to refinement while "tn" indicates that only mesh cells or B-spline supports with a relatively high number of out-of-tolerance points combined with a large distance to the surface will trigger refinement. Strategies marked with B will refine in both parameter directions at each iteration step while strategies marked with A will refine in alternating parameter directions. Strategies starting with F are full span strategies meaning that all B-splines having the identified cell in its domain are split. Mc are minimum span strategies. Here, only the one B-spline is defined to be refined and the criterion is a combination of size and number of associated out-of-tolerance points. For S strategies the identified B-spline is refined in all knot spans, while for R the knot spans containing most out-of-tolerance points are refined. McA tn is the most and SB the least restrictive refinement strategy in the list. We refer to Chap. 3 for more details on each refinement strategy.

In Table 5.6, we compare the results for the different strategies after the last iteration. We see that the "A" strategies always need a higher computational time than the "B" since the refinement in "A" is performed in each direction separately


**Table 5.6** Refinement strategies and associated accuracy results

The iteration is stopped after 20,000 surface coefficients (*ncp*) is reached. Distances are reported in m and computational time in min and s. The approximation efficiency is computed as number of points with a distance less than the tolerance (*nin*) divided by the number of coefficients (*ncp*). Thus, a high efficiency number is beneficial. The tolerance is 0.5 m and the number of data points is 26,076,683

implying that the number of coefficients to estimate is higher. However, the final number of coefficients relative to the accuracy tends to be lower for the "A" methods and the efficiency is higher. For this data set, SB has the lowest computational time but a high number of coefficients and the poorest efficiency among the recorded strategies. The best approximation efficiency is found for FA (marked with bold font). However, the efficiency does not take the value of the residuals into account as long as it is smaller than the prescribed tolerance; here the actual distance could be considered as well. We see that some methods will have lower computational time than FA. Thus, if the time is regarded as more important than the number of surface coefficients, FB and McB are good alternatives, preferably with some restrictions on the mesh cells that trigger refinement (tn). The results in this experiment fall well in line with the conclusions in [Sky22]. The choice of the refinement strategy could be also seen as a model selection problem, following the concept described in Chap. 4.

## *5.4.2 Dealing with Missing Points and Voids: Trimming*

In Computer Aided Design (CAD), trimming is used to remove extra lines or extra parts of an object, see, e.g., Marussig and Hugues [Mar18] for an overview of methods in Isogeometric Analysis (IgA). Trimming aims to optimize the modeling and visualization of the approximated surface. Here we apply it to handle data gaps and "cut" the domains where no points were available for fitting. Often, this would have led to unfavorable ripples or voids as the algorithm tries to approximate without data support. We note that the parameterization and the mathematical description of the surface remain unchanged after trimming. We summarized the principle of trimming as follows:


Figure 5.14 explains the computation of trimming loops and the trimmed surface. A polygon of horizontal and vertical lines in the *x y*-plane surrounding the points is

**Fig. 5.14** Computation of the trimmed surface. **a** The point cloud (in khaki green) is recursively divided into subsets of the point cloud and bounded by polygons, **b** the composite polygons bounding the entire point cloud, **c** the polygons are approximated by a set of spline curves, **d** the final trimmed surface

computed in a recursive procedure. Depending on the density of the point cloud, a maximum recursion level is selected. A dense point cloud allows more recursions and consequently a more accurate polygon. The point cloud is recursively divided into blocks as shown in Fig. 5.14a. Here the maximum recursion level is two. The boundary lines of the blocks containing points are collected, while removing lines that occur twice. This happens when two adjacent blocks contain points. The resulting lines are sorted to create one or more polygons, see Fig. 5.14b. In Fig. 5.14c, the polygons are divided into pieces, each being approximated by a spline curve, and finally, in Fig. 5.14d, the trimmed surface is shown.

## **5.5 Conclusion**

Adaptive LR B-spline surface approximation is a flexible method to "transform data into information". Within a context of approximating geospatial data, huge, noisy and scattered data set from terrains or seabeds can be represented in a compact way. The surface approximation with LR B-splines has following advantages:


In this chapter, we have highlighted these properties and approximated a data set composed of seabed and terrain data recorded from sensors having different noise properties. More specifically:

1. We have compared different pre-processing strategies to eliminate outliers, and found that the method identifying single outlier points with no direct link to statistical methods hits the target best: it reduced the risk of eliminating features that need to be approximated but found real outliers.


The result of the surface approximation with LR B-splines is a mathematical surface with few coefficients in comparison to the huge number of points to approximate. The surface describes the underlying ground with high accuracy, which can be assessed by means of simple statistical quantities. Ongoing research tries to find the most optimal surface with respect to the data at hand by setting, e.g., the tolerance less empirically. To that aim, concepts developed in Chap. 4 can be used for smooth and homogeneous point clouds. In Chap. 6, we will present further applications of the LR B-spline surface approximation, such as deformation analysis with LR B-spline volume, or the drawing of contour lines from the mathematical model.

## **1 Appendix: Output Format and Source Code**

An LR B-spline surface is stored in an ASCII file using doubles for the storage of coefficients. It is also supported by Part 42 of ISO 10303 (the STEP standard). The export of the LR B-splines surfaces to other formats is crucial for further processing of the mathematical surfaces. Exemplary, raster is the standard representation for terrains and seabed in current GIS software. Unfortunately, this representation does not support the same level of detail as an LR B-spline surface of the same area in general. To circumvent that challenge, different possibilities exist:


4. A better option is to represent the LR B-spline surface by a collection of TP Bspline surfaces maintaining the feature of data size distributed according to needs. To that aim, the LR B-spline surface can be divided into TP B-spline surfaces by the means of dedicated knot line insertions. The division into TP B-surfaces is performed by a recursive algorithm. This division is also an ingredient in the computation of contour curves and some details are given in Chap. 6, Appendix 1.

Please note that for all computation, we made use of the GoTools library module LR Splines 2D. The source code is freely made available by SINTEF Digital, Department of Mathematics and Cybernetics for downloading at the link: https://github. com/SINTEF-Geometry/GoTools/wiki/Module-LRSplines2D.The hardware requirements are Windows, Linux or MacOS. The program language is C++. Following software are required: Cmake, Boost, and Qt for the viewer, which is used to visualize the approximated surfaces in this chapter.

## **References**


[Wan15] Wang, Y., & Feng, H.-Y. (2015). Outlier detection for scanned point clouds using majority voting. *Computer-Aided Geometric Design, 62*, 31–43.

**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Chapter 6 LR B-Spline Surfaces and Volumes for Deformation Analysis of Terrain Data**

**Abstract** Geospatial data acquisition of terrains with contact-free sensors such as Terrestrial or Airbone Laser Scanners generates scattered and noisy point clouds. Performing a surface approximation is an efficient way to reduce and structure the recorded point clouds. To that end, LR B-splines are attractive as they allow a local refinement, on the contrary to the tensor product B-spline and raster surfaces. By comparing the approximation error with a given tolerance, a local refinement is performed. We apply this adaptive refinement strategy to landslides data sets from Alpine terrain in Austria. We show how different epochs of the point clouds can be analyzed with LR B-spline volumes for spatio-temporal visualisation of deformation. We highlight the potential of a time-differenced LR B-splines volume for analysing geomorphological changes. A further application of this method is the drawing of contour lines.

**Keywords** GIS data set · Geospatial data set · LR B-splines · Adaptive surface fitting · Spatio-temporal deformation analysis · Contour lines · LR B-spline volumes · Geomorphological analysis

## **6.1 Introduction**

Terrestrial Laser Scanners (TLS) are contact-free measuring sensors. They record dense point-clouds of objects or scenes by acquiring coordinates of points and an intensity value; this latter depends, e.g., on the reflected surface or atmospheric propagation, see Wujanz et al. [Wuj17]. TLS range measurements can be either based on phase shift or time-of-flight. We refer to Vosselman and Maas [Vos10] or Pfeifer and Briese [Phe07] for more details. Typically, the range measurement in phase shift is more accurate than time-of-flight TLS but the maximum range is smaller. Due to its high scanning rate, it is not uncommon that a point cloud contain millions of points, which need to be processed in some way. Well-known software are, e.g., the freely available CloudCompare [CC22], or MeshLab [Cig08]. Within a geodetic context, prominent applications using TLS point clouds are deformation analysis for objects such as tunnels [Jia21], dams [Gon08] or bridges [Zog08], see also Mukupa et al. [Muk16]. Because TLS enables fast and precise mapping, they are currently used for monitoring forest canopy [Gri15] or landslides [Bar13].

The processing of huge point clouds can quickly become computationally unfavourable. Parametric spline surface approximation techniques address the challenge by reducing "data" into "information". The point clouds are efficiently compacted into a a manageable number of coefficients that are often estimated by least-squares (LS) adjustment. The mathematical modelization enables deformation analysis and rigorous statistical testing based on the estimated surfaces rather than on the original point clouds. Unfortunately, the Non Uniform Rational B-splines (NURBS, [Pie95]) do not allow for local refinement. This approximation method is unfavourable when the point clouds are scattered and noisy. Here the risk of overfitting should not be undertaken as it leads to unwanted ripples and oscillations in the approximated surface, see Bracco et al. [Bra18]. This latter may be confounded with deformations.

Starting from a first approximation of the point cloud using a coarse NURBS mesh, there exist three main approaches to perform an adaptive local refinement:


The remainder of this chapter is as follows: in a first step, we will shortly review the surface approximation with LR B-splines. The reader is referred to Chap. 2 and Chap. 3 for more details. We will introduce LR-B-spline volume as a promising tool for visualising and analysing spatio-temporal deformations. We will develop how contour lines can be computed from LR B-spline surfaces in a dedicated appendix. The chosen point cloud to illustrate the surface approximation is from the Alpine region in Austria.

## **6.2 Description of the Data Set**

In the context of climate changes and the expansion of areas of urban settlement, e.g., in Alpine regions, early warning system for risk management necessitates highquality data sets that are both spatially and temporally detailed. In this chapter, we perform spatio-temporal deformation analysis with mathematical approximations from a data set recorded in Austria, which will be shortly described in the following section.

**Fig. 6.1** Left: View of the region under consideration, Right: Visualization of one point cloud with the software CloudCompare

# *6.2.1 Deformation Monitoring with a Terrestrial Laser Scanner*

The data set studied in this chapter was recorded in the Valsertal region in Austria as part of HORIZON 2020 via the RFCS (Research Fund for Coal and Steel) funded research project *i* 2MON-"Integrated Impact MONitoring for the detection of ground and surface displacements caused by coal mining". Here we focus on the observations from the TLS (VZ-2000i, see http://www.riegl.com/nc/products/terrestrialscanning/produktdetail/product/scanner/58/), Further information about the experiment can be found in dedicated publications, e.g., Schröder and Klonowski [Sch20].

The monitoring of the Valsertal aims to analyse deformation with the help of a long-range TLS, for underlying safeguard applications. A continuous series of measurements are available from August 13, 2020 up to and including September 8, 2020. In this contribution, we have selected a small excerpt from August 20, 2020 to August 22, 2020. We focus on an area in the lower part of an alpine slope where some rearrangements of the facilities took place in this period. The measurements were made in a refuge opposite an area affected by a rock fall as shown in Fig. 6.1. A geodetic instrument called a total station was installed on one of the two measuring pillars in the hut measuring 16 reflectors at a distance of approximately 250–800 m every hour. On the second pillar, a laser scanner was used for the permanent installation. One point cloud of the whole area was recorded every two hours resulting in 36 point clouds in total. In order to separate the expected apparent deformations within the time series from the influences of the georeferencing, the laser scanner was expanded to include a platform with inclination sensors in the horizontal plane of the local scanner coordinate system. The GNSS antenna on the scanner was used for time synchronization. This set up allows us to highlight the potential of surface approximation with LR B-splines from TLS observations with the aim to visualize spatio-temporal deformation based on mathematical surfaces and volumes.

August 22th 1 AM August 22th 11 PM

**Fig. 6.2** Examples of the initial data sets

## *6.2.2 Data Set Preparation*

The point clouds differ in size, depending on the epochs when they were recorded. Figure 6.2 illustrates the three different extensions of the selected point clouds. We have harmonized the point clouds so that they all fit with the smallest extension shown at the bottom right. The long tail to the left is excluded from the approximation. Obvious outliers are removed using the method of Chap. 5, Sect. 5.3.2.3. We note that the data sets from the first epochs include a large tree that later disappeared. In the later part of the period, the vegetation was filtered out to gain in accuracy. Figure 6.3 shows two examples of the point clouds processed for our study, with and without the tree. The tree was intentionally kept in the first point clouds to spice the spatio-temporal analysis. This way, we can show the potential of LR B-spline volume to detect such changes, with applications in forestry inventory, see, e.g., Liang et al. [Lia6].

August 20th 11 AM August 21th 1 PM

**Fig. 6.3** Examples of the processed data sets

## **6.3 Surface Approximation of the Selected Data Set**

Approximating point clouds with tensor product (TP) B-spline surfaces does not allow for local refinement. LR B-splines are a way to locally refine the spline space and are shown to provide well-behaved mathematical surfaces for remote sensing applications, Skytt et al. [Sky22]. More specifically, an adaptive surface fitting is performed when the L1 norm—called in the following "error" or difference in absolute value—between the mathematical surface and the points inside a cell of the mesh exceeds a tolerance at a given step of the algorithm. The surface approximation obtained with LR B-splines depends on the tolerance from which the refinement is performed, the refinement method itself and the bidegree of the spline space. We have reviewed in Chap. 4 statistical methods to fix these parameters optimally.

# *6.3.1 General Principle of Adaptive Approximation Combining Least Squares and Multilevel B-Spline Approximation*

We parameterize the data points by their *x*- and *y*-values, and approximate the *z*values by an LR B-spline surface (function). The starting point of the iterative surface approximation with LR B-splines is a tensor product B-spline surface. Here a Bspline surface grid (also called also mesh) corresponds to the initial setting of the topology. Then an optimization is performed to compute the best LR B-spline surface corresponding to the initial mesh for approximating the point cloud. Once the initial surface is obtained, the error term between the mathematical surface and the points in the *z*-direction is computed.We consider the L1 norm, see Al-Subaihi et al. [AlS04]. If this value exceeds a given tolerance, a refinement is performed in the cells containing the corresponding points. The adaptive fitting is performed until a given number of iterations is reached or until no more error terms exceed the tolerance.

Extra degrees of freedom are inserted in the LR B-spline surfaces *locally*, where needed. The main advantage is that noise overfitting is avoided and the growth in data volume is limited, on the contrary to global fitting strategies where *all* cells are refined at *every* iteration steps. Clearly, with locally adaptive refinement methods, there will be cells that remain unchanged after a given iteration step because the error term is smaller than the chosen tolerance or there is no point in the cell. Intuitively, few cells are refined with a large tolerance whereas, all cells will be divided for a very low tolerance.

In the first iteration steps, the fitting is made using the LS method, i.e., the L2 norm or Euclidean distance between the parametrized point cloud and the parametric surface is minimised. As the number of refinement steps increases, it is favourable to switch to the Multilevel B-spline Approximation (MBA) developed by [Lee97], see also [Sky15] and Chap. 3 for more details on the procedure. For the MBA strategy, no equation system is solved as the coefficients are computed locally and explicitly: the residuals of the data points obtained from the last fitted surface are recursively approximated using finer meshes.

## *6.3.2 Approximation of the Selected Data Set*

In this section, we present the results of the approximation of the domain under consideration, see Fig. 6.4.

#### **6.3.2.1 Goodness of Fit**

The goodness of fit is assessed using the usual criteria, as described in Chap. 3:


#### **6.3.2.2 Dealing with Outliers and Voids**

Voids and outliers are typical challenges related to geospatial data, which will be shortly addressed in this section.

#### **Voids**

The point clouds contain voids which come from the scanning configuration and generate domains where no points are recorded. Data gaps challenge the approximation as the surface will try to extrapolate the known information in these areas, creating artificial oscillations. This drawback is particularly emphasized when LS approximation is used. MBA is more robust due to its explicit formulation and its similarity

**Fig. 6.4** Point cloud with about 1 million points acquired at 9 AM August 21st

with a L1 norm fitting, see Kermarrec and Morgenstern [Ker22] for a discussion. To face that challenge, we apply a 2-steps procedure:


#### **Outliers**

The point clouds under consideration contain noise and artifacts. These latter need to be removed prior to the surface approximation in order to avoid the fitting of outliers, which is unfavorable for further interpretation of the results.

Different methods exist to detect outliers. They are reviewed and applied in Chap. 5, Sect. 5.3.2.

Following strategies can be applied besides eliminating outliers:


The data sets used in this chapter is cleaned for obvious outliers using the last procedure described in Chap. 5, Sect. 5.3.2.3.

# *6.3.3 Visualization of the Approximation at Different Iteration Steps*

In this section, we will present some results of the surface approximation with the aim of being didactic and visual to improve the understanding of iterative surface approximation with LR B-splines.

#### **6.3.3.1 The Adaptive Surface Approximation, Step by Step**

A tolerance of 0.5 m is selected for the approximation of the selected point cloud. If a smaller tolerance is used, the risk of overfitting increases, which is unfavorable for a smooth and reliable surface fitting. A higher tolerance leads to a convergence after only few iterations, without the advantages of local refinement being visible, i.e., the approximation remains coarse.

We start the approximation with an initial biquadratic TP B-spline surface of 10 times 10 coefficients to approximate the point cloud. This corresponds to iteration 0 as shown in Fig. 6.5. Then the algorithm is allowed to run for 7 iterations, ending as shown in Fig. 6.12. The final surface output is shown in Fig. 6.12. We additionally draw the trimmed surface with respect to the point cloud domain (Fig. 6.13). Here the domains with no point are "cut" so that no unwanted effects such as, e.g., oscillations or drops occur in the domain without any observation.

#### **6.3.3.2 Goodness of Fit**

The results of the approximation are summarized in Table 6.1. Combined with Figs. 6.5, 6.6, 6.7, 6.8, 6.9, 6.10, 6.11 and 6.12, the impact of increasing the number of iteration steps is visible: after 5 iterations, the MAE does not decrease significantly (from 0.095 to 0.083 m), but the maximum distance does (from 0.095 to 0.083 m). This difference highlights the main advantage of local refinement: only specific domains are approximated locally with a higher accuracy by letting the main part of the point cloud untouched. The meshes allow to visualize more clearly the property of the iterative local refinement, although it is hardly possible to distinguish the differences in the last two iterations. Here the increase in the number of coefficients between iteration 5 and 7 (from 5700 to 14,000) is more descriptive (Table 6.1).

Figures 6.5, 6.6, 6.7, 6.8, 6.9, 6.10, 6.11 and 6.12 further highlight how the original coarse mesh from the first iteration is refined at each step. The number of meshlines, and so the number of coefficients, increases to account for local details *only*. Figure 6.13a shows where the points outside tolerance are located on the final surface after 7 iterations. The underlying final mesh is depicted additionally. There remain

**Fig. 6.5 a** Initial surface, **b** points coloured according to the distance to the surface, cell boundaries, **c** LR mesh (TP mesh), **d** colour scheme used for coloured point clouds in this figure and the subsequent figures

**Fig. 6.6 a** Surface after the first iteration, **b** points coloured according to the distance to the surface, cell boundaries, **c** LR mesh

**Fig. 6.7 a** Surface after the second iteration, **b** points coloured according to the distance to the surface, cell boundaries, **c** LR mesh

**Fig. 6.8 a** Surface after the third iteration, **b** points coloured according to the distance to the surface with cell boundaries, **c** LR mesh

domains where the approximation cannot be further improved: Table 6.1 shows that increasing the number of iterations does not lead to a smaller MAE after the sixth iterations. Indeed, the number of points outside tolerance decreases but in % of the total number of points, the difference is irrelevant (99.3 vs. 99.1% between the sixth and the sevenths iteration). At the same time, the number of coefficients increases from 9000 to 14,000. The computational time increases as well, but remains at a moderate level (a few sec). Thus, when searching for an optimal surface approximation, a balance has to be found between the number of iterations, the MAE and its relevance, the maximum distance and the computational time for a given tolerance. This choice is let to the practitioner which should judge the risk of fitting the noise as the number of iterations increases. An indication can be provided by searching the

**Fig. 6.9 a** Surface after the fourth iteration, **b** points coloured according to the distance to the surface, cell boundaries, **c** LR mesh

**Fig. 6.10 a** Surface after the fifth iteration, **b** points coloured according to the distance to the surface with cell boundaries, **c** LR mesh

minimum of AIC, as describe in Chap. 4. In our particular case, a minimum could be found after the eighth iteration. We point out that the AIC gives an indication about the turning point from which further refinement is not leading to a strong decrease of the root mean square error with respect to the increase of coefficients. It is a global criterion, that does not provide information about the local adjustment.

**Fig. 6.11 a** Surface after the sixth iteration, **b** points coloured according to the distance to the surface, cell boundaries, **c** LR mesh

**Fig. 6.12 a** Final surface, **b** points coloured according to the distance to the surface, cell boundaries, **c** LR mesh

**Fig. 6.13 a** Points with a distance larger than the tolerance (0.5m), **b** surface trimmed with respect to the point cloud domain (trimming with respect to the outer boundary only)


**Table 6.1** Adaptive approximation of the point set shown in Fig. 6.4

The maximum and average distance for each iteration level is reported along with the number of points with a distance larger than the tolerance, the percentage of resolved points and the number of surface coefficients. Spatial units are m and computation time is given in s

# **6.4 LR Spline Volumes to Analyse Spatio-temporal Deformation**

The domain under consideration was scanned every two hours during three consecutive days. This leads to a large amount of point clouds, making the use of surface and *volume* approximations relevant to visualize and analyze the deformations or changes that may occur during that time. The main advantage is not having to work with the noisy and scattered point clouds. This is computationally advantageous and allows for a simpler interpretation.

## *6.4.1 Principle of Volume Approximation*

To get an impression on the continuous development of the landscape in the selected area, the time component is added as a third parameter direction in the data set allowing an interval of 0.5 between each set in the time direction. Figure 6.14 shows the structure of the composed raw point clouds, before the volume approximation. The block of point clouds is narrow in the time direction compared to the space directions, and the time layers are distinguished by colour. The distances between these layers are larger than the distances between points in the *x y*-plane, but still small enough to control the behaviour of the spline volume approximation. The height range corresponding to the different epochs is very dependent of the presence of the aforementioned tree, i.e., the data set shown in Fig. 6.3a has a range of [−42*.*69*,* 17*.*77] while the range for the points in (b) is [−42*.*7*,* 8*.*55]. The total height range in including all point sets is [−42*.*71*,* 17*.*89].

A point cloud assembled from all epochs is approximated by an LR B-spline volume. Following Sect. 6.3.3, we apply a tolerance of 0.5 m and perform 7 iterations. Figure 6.16 shows the corresponding LR volume. The visualization is performed with

**Fig. 6.14** The structure of the volume *point cloud*. The points are represented by their *x*-, *y*- and time coordinates and points from different acquisitions are distinguished by colours

a dedicated viewer as proposed in [Fuc17]. The colours are linked with elevation. Here we *do not* perform a surface approximation of each of the 36 point clouds independently and individually, as shown in the previous section: we approximate the block of point clouds as a whole.

The trivariate point cloud is approximated by an LR B-spline volume following the same approach as for surface approximation explained in Chap. 3.


At each iteration step, an updated approximation using MBA is computed. The iteration stops when the given tolerance is met or a maximum number of iteration steps is applied. Figure 6.15 shows a mesh corresponding to a triquadratic LR Bspline volume with initially three inner knots in each parameter direction after one iteration. The refinement is performed only in the first parameter direction and the inserted mesh rectangles are highlighted with yellow colour.

We will refine in all three parameter directions simultaneously. The maximum distance between the point cloud and the volume after 7 iterations is 29 m. The most distant points are associated to the tree, which cannot be well fitted by a smooth surface. The average distance is 0.17 m and 1,507,346 out of 37,785,650 points have a distance to the volume larger than the tolerance of 0.5 m, meaning that 96% of the points are within the resolution. The maximum distance is kept at approximately the same level throughout the computation indicating a feature unsuitable to be fitted

**Fig. 6.15** Simple LR volume mesh with mesh rectangles

**Fig. 6.16** LR B-spline volume approximating the trivariate point cloud. The height field is represented with colours

with a smooth volume, while the average distance is gradually reduced. The final number of coefficients is 93,829. The computational time is 10 min and 7 s excluding file operations, which is manageable from a practitioner perspective.

The domain of the LR B-spline volume approximation to the point cloud corresponds to the axis parallel bounding box surrounding the parameter points {*xi, yi, ti*}*<sup>N</sup> <sup>i</sup>*=<sup>1</sup> where *N* is the total number of points. As the boundaries of the point cloud do not adapt to axis parallel lines, for some parts of the volume shown in Fig. 6.16 there are no corresponding data points. The figure visualizes the height field corresponding to the point cloud with colours. The volume is cut at the position

**Fig. 6.17** LR-spline volume. Cutting plane in the x-direction gives some indication of landscape changes

of the tree, which is present in the point clouds at the beginning of the acquisition period only. The tree can be recognized in the volume approximation and is marked by a circle in the figure.

The presence or absence of the tree is the largest difference between the different data sets and the transition from tree to no-tree is totally dominant in the LR volume description of the landscape throughout the three days of acquisition. Some smaller differences between the data sets can be distinguished in the volume, see the marked area in Fig. 6.17. The corresponding marks is weak: to get clearer indications of change, we turn to the derivative of the volume in the time direction.

## *6.4.2 Volume Changes in the Time Direction*

A partial derivative of a polynomial TP B-spline volume is a polynomial TP B-spline volume with the polynomial degree decreased by one in the direction of differentiation, and very simple formulas exist for computing the derivative. As a spline volume the partial derivative of an LR B-spline volume in some direction is also an LR B-spline volume, but the local structure of the LR B-spline volume implies that the differentiation procedure becomes complex. In the following, we represent, for each cell, the field as a spline volume without inner knots and differentiate cell by cell.

#### **6.4.2.1 Partial Derivative of the Volume for Spatio-temporal Analysis**

A non-zero derivative of the height field reveals changes in height: by focusing on the time direction only, we exclude landscape formations that do not change over time.

Figure 6.18 shows changes in time revealed by the time derivative of the LR B-spline volume. The visualization focuses on derivatives within a [−3*,* 3] range. Higher values for the derivative field exist, but are not highlighted explicitly. The field is hidden for derivatives close to zero. We see that changes in the height field

**Fig. 6.18** The derivative of the height field in the time direction represented as an LR B-spline volume

**Fig. 6.19** The derivative of the height field in the time direction just before August 20th at 11 PM (left) and just before 5 AM the same day (right)

occurs as blobs in the total volume. The figure presents an overview of the volume (top) and looks into it from the *x*-direction (bottom). The blobs of change are local in the time direction. They corresponds to (i) an object that appears and disappears or (ii) a modification of the landscape that is later left untouched. The spot marked with "A" represents the tree that is present in the first part of the time line. "B" is outside the point cloud and indicates that the height field is gradually decreased after removal of the tree. "C" and "D" are areas of interest to be discussed in the following.

Figure 6.19 focuses on the area marked with "C" in Fig. 6.18. Here it is marked with a circle. Green colour means no change in time, red means that the height increases while blue means that it decreases. The activities performed in period of time in this area are discussed in further detail in Sect. 6.4.3.

In Fig. 6.20 the area "D" is marked with an ellipse. In the cut to the right, we can see from the colours that the height field increases and then decreases again. In

**Fig. 6.20** The derivative of the height field in the time direction just after August 21st at 11 PM (left) and a cut through the volume in y-direction hitting area "D" specified in Fig. 6.18

**Fig. 6.21** The derivative of the height field in time direction at August 20th at 11 PM and corresponding differences in the point elevation, raster view

between, there is a narrow green strip indicating that the observed object, probably an excavator, is present. The incident to the right happens at an earlier point in time than the one to the left (right picture). This indicates a movement of the excavator from position "1" to position "2".

#### **6.4.2.2 Spatio-temporal Changes Visualized as Medical Images**

In Fig. 6.21, a visualization dedicated to medical images [ITK] is used to give an alternative view of the height changes. The volume is represented as a 255 × 255 × 255 raster and the values are computed by averaging a number of sample points in the raster cells. The views in "a" and "d" show the same snapshot in time for different features of the volume. The derivative of the height field in the time direction for August 20th 11 PM is shown in Fig. 6.21, "a". Views "b" and "c" show cuts through the volume with constant y and constant x, respectively. The three views are connected by the blue cross. In "d" the average difference in point elevation for each cell is shown. The circle indicates the position of the tree in the views where it is present. Views "a" and "d" indicates that two objects of significant size are situated close to the blue cross and there are some smaller additional modifications of the elevation in the vicinity of the cross. The duration of the elevation changes can be seen from view "b" and "c" where the development in time is shown in the vertical direction. This reveals that the highlighted elevation changes were temporary. Note that a permanent change in elevation will appear only as a limited white spot in view "b" and "c". View "d" can be used to place an incidence in the landscape as the intensity map gives an indication of the terrain. White indicates steep areas or rapid landscape changes in time, while plains and areas without points are black.

Figure 6.22 provides another visualization of the situation in Fig. 6.20. The descent from the tree is completed as can be seen in view "b". Note that the derivative has its largest values when the volume adapts to a change and not when the peak or dump is at its largest. The time of the ring is when the tree is no longer present. The two positions of the excavator are shown in views "a" and "b". Due to the position of the cut, the appearance and disappearance of only one excavator are visible in "c".

## *6.4.3 Difference of Surfaces*

In this section, we propose to analyse more specifically changes in the point cloud with LR B-splines surfaces. Here we look at the surface approximation of the point clouds acquired at 11 PM and 5 AM at August 20th, see Fig. 6.23. The maximum distance between the points and the surface is 28.58 m for (a) and 28.30 m for (b). The average distances are 0.178 and 0.179 m, and the number of points with a distance larger then 0.5 m is 25,583 and 24,801. The numbers of data points are 1,071,938 for (a) and 1,068,546 for (b). The main obstacle for an accurate approximation is the tree in the upper left corner, but also some excavators in the right half of the figures are impossible to represent exactly with a smooth surface. Some differences between Fig. 6.23a and b can be identified, but the general impression is that there is little difference in the landscape between the two epochs. Figure 6.24 reveals some more details. Here the difference surface between the two surfaces in Fig. 6.23 is computed and represented as an LR B-spline surface, *difference surface = surface b - surface a*. The surface is trimmed according to the point cloud at 11 PM. Contour

**Fig. 6.22** The derivative of the height field in time direction at August 21st at 11 PM and corresponding differences in the point elevation, raster view

curves are computed for every 0.25 m between −5 and 5 m. Some details concerning the computation of these curves are given in Appendix 1. The green curves visualize material that is removed from (a) to (b) and red curves material that is added. The black curves show the zero level for the difference surface. Most of the surface is oscillating slightly below and slightly above zero. This is an effect of differences in the point clouds and the approximation error and does not represent a change in the landscape. The red and green curves in most cases represent change. At the point marked with "A" is the aforementioned tree and the difference here is not due to a real change. At "B", an excavator is added an at "C" one is removed. Snow is removed at "E" and moved to "D". Some local changes in the snow cover take place at "F". This short analysis highlights the potential of surface approximation to analyse deformation with application for geomorphological analysis. We refer to [And21] for an example based on the noisy and scattered point clouds.

**Fig. 6.23** The situation at 11 PM (**a**) and 5 AM (**b**) at August 20th visualized as surfaces

**Fig. 6.24** The difference between the situation at 11 PM and 5 AM at August 20th visualized as a surface with associated contour curves for every 25th cm between −5 and 5m, green curves represent negative levels, red curves positive and black the zero level

## **6.5 Conclusion**

We have presented a local adaptive refinement strategy to approximate efficiently scattered and noisy point clouds from TLS. Prominent applications are deformation analysis or monitoring, without having to manipulate or filter a huge amount of data. To that end, we have used LR B-spline surfaces, which were shown to be well adapted to fitting terrains and seabeds. This approach refines the point clouds locally, avoiding the computation of unnecessary surface coefficients: the output is a compact surface in a short amount of time. This mathematical representation is favorable for further analysis of the point cloud; the noise is filtered out, and voids can be handled efficiently with CAD techniques such as trimming. The approximation method is based on a combination of LS, to which a smoothing term can be added in the first iteration steps, and MBA. Outliers are to be eliminated prior to the surface approximation. A classification can be performed in advance to eliminate, e.g., trees or cars if only the ground is of interest for deformation analysis.

We have applied the algorithm to TLS point clouds recorded in the Alpine region in Austria. The domain under consideration was scanned every two hours during three consecutive days. This large amount of data allows a visualization of change pattern from the mathematical approximations, without having to manipulate the original point clouds. To that end, we have introduced the LR B-spline volume and its derivative as a possibility to visualize spatio-temporal changes. The story of the point clouds could be guessed, paving the way for new applications of surface approximation within a GIS context. We have used images inspired by medical applications to visualize and analyse geomorphological changes. These examples highlight the potential of combining different visualization techniques to extract spatio-temporal information from a high number of point clouds.

The source codes to perform the approximation with bivariate (lrsplines2D) and trivariate (lrsplines3D) LR B-splines are made available by SINTEF Digital, Department of Mathematics and Cybernetics for downloading at the link: https://github.com/SINTEF-Geometry/GoTools.

The hardware requirements are Windows, Linux or MacOS. The program language is C++. Following software are required: Cmake, Boost, Qt for the viewer used to visualize the approximated surfaces in this chapter.

## **1 Appendix: Contour Curves**

In Fig. 6.24, we showed contour curves corresponding to the underlying surface. The calculation of contour curves is supported in all GIS systems. For LR B-spline surfaces, contour curves are curves where the value of the spline function is constant.

To compute the contour curves, we search for curves *fa (t)* = *( f*1*(t), f*2*(t))<sup>T</sup>* ∈ *R*<sup>2</sup> such that *F( f*1*(t), f*2*(t))* = *a* for an LR B-spline surface *F* and an elevation value *a*. To that end, we split the LR B-spline surface into a number of TP B-spline surfaces. The division into TP B-spline surfaces is performed by a recursive algorithm. At each level, we consider how the current surface can be split by extending one meshline to cover the entire surface domain. The candidate meshline must contain T-joints, i.e., at least one meshline in the other parameter direction must end at this meshline. The number of surface elements overlapping the meshline extension should be minimized and at the same time the meshline should divide the current surface into two surfaces with roughly the same number of knots. The balance between the two criteria varies throughout the recursion levels. When an appropriate split is found, the algorithm proceeds to look for splits in the two sub-surfaces. The splitting algorithm stops when no sub-surface contains more meshlines that don't traverse the surface domain than a given threshold. Each sub-surface is expanded to a TP B-spline surface by adding missing mesh line segments.

Figure 6.25 illustrates the division of the difference surface presented in Sect. 6.4.3 into TP B-spline surfaces. Our aim is to study the computation of contour curves with zero height with some detail.

We use the interrogation functionality of SINTEF's spline library, SISL [Dok21] on each sub surface after the LR B-spline surface is split into TP B-spline surfaces.

**Fig. 6.25** Division of LR B-spline surface into a set of TP B-spline surfaces. **a** the initial surface, **b** the collection of TP B-spline surfaces distinguished by colour, the trimming curves corresponding to the LR B-spline surface are shown in black, **c** the LR mesh corresponding to the surface in **a**, **d** the mesh corresponding to the collection of TP B-spline surfaces

**Fig. 6.26** Computation of contour curves. **a** Complete set of contour curves with elevation zero (blue curves), one TP B-spline surface is highlighted for further study, **b** guide points from the first part of the computation (red) with connections between them, **c** points generated by tracing the contour curves (green) and the final curves

The contouring problem corresponds to computation of intersections between a parametric spline surface and an algebraic surface, a problem that is discussed in [Pat02].

The applied algorithm can be divided into three parts:

	- (a) Compute the topology of the contour curves using SISL. This is a recursive algorithm that finds a set of "guide points" on each curve branch.
	- (b) Trace each identified curve branch starting from an identified "guidepoint". Represent the curves traced out as spline curves.

An LR B-spline surface approximating an area with large shape variations will contain many details, which again will lead to a complex pattern of contour curves. Efficiency and robustness of the algorithm is reached through good interception methods and a clever strategy for dividing the surface into subsets. A discussion on subdivision strategies for surface intersections can be found in [Dok07]. A general rule is to subdivide at singularities and internal in closed loops. A complex situation leads to more subdivisions and consequently more guide points.

Figure 6.26 illustrates the computation of the contour curves. The red guide points in (b) are found at boundaries between sub surfaces. In such a complex situation, several recursion levels are required to be able to separate the different branches of the contours and ensure that no more closed contour curves exist. The last sub surface domain is shown in the upper right corner of the picture. All coefficients of the corresponding TP B-spline surface are negative. Thus, there is no possibility of a contour curve in this area and the computation can be finalized.

Given information about all contour curve branches in the area of interest, the curves can be drawn. Here the objective is to describe the curve with sufficiently accuracy, handle sharp turns in the curve and avoid jumping to a different contour curve. A marching procedure is applied. Given one point on the curve, a guess for the next point is made. The new point is moved to the contour curve and the segment between the two points is checked for consistence. The distance between the points is diminished if necessary. Figure 6.26c shows the tracing results. The density of the points is increased at sharp corners and when two curves pass within a small distance. Fragments of the contour curves are computed separately for each sub surface and the final step is to merge curve fragments across subset boundaries.

## **References**


**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

# **Chapter 7 Conclusion**

In this SpringerBrief, we went through the mathematical concepts of LR B-splines in all its facets, explaining in details how the spline space can be refined and the different strategies for adaptive approximation. Through detailed example with seabed and terrain data sets, we have highlighted how adaptive surface approximation of various noisy and scattered point clouds is performed concretely. We showed how to deal with challenges raised when working with real data set such as voids or outliers. We presented numerous applications that can be derived from "transforming data to information". More specifically, we reviewed:


We have illustrated the principle of surface approximation with various examples, using data from terrestrial laser scanner, sonar, and terrain or seabed data set. We have proposed the LR B-spline surface and volume as a new and promising format for representing noisy and scattered point clouds in a compact form. This approximation method provides a middle road between the rigid, but effective regularity of the raster format and the large flexibility of triangulated surfaces. LR B-spline surfaces are smooth and can, due to their adaptive potential, represent local detail without a drastic increase in data size. For point clouds coming from sensors having a very

109

high data rate and containing millions of points, the computational time of the surface approximation stays manageable: This is a strong argument for a wide acceptance of the local iterative fitting to represent scattered and noisy data mathematically. The surfaces can be exported as rasters in various resolutions as well as collections of tensor product spline surfaces. All software are freely available to promote usage.

# **7.1 A Promising Application: The LR B-Spline Volume for Spatio-temporal Analysis of Geomorphological Changes**

The analysis of spatio-temporal deformation is one of the most promising applications of LR B-spline surfaces and volumes. These latter will provide a framework for analysing geomorphological changes without having to work on noisy and scattered point clouds of different quality, coming from different sensors. Detecting and analysing the movements of, e.g., sand dunes and snow masses is a potential use of LR B-spline volumes to gain a better understanding in the underlying geophysical processes. Similar applications are conceivable within the context of geodetic deformation analysis of structures, such as tunnels or dams. Advanced methods could take features such as extremal points, ridges and valleys into account. Outliers should be efficiently identified during the computation of the approximating surface and removed during the iterative algorithm presented. The outlier problematic is an important topic for further work and has been addressed in the present SpringerBrief.

## **7.2 Ongoing Research**

Ongoing research focuses on identifying the most optimal surface. New criteria could be investigated to judge the quality of the approximation by accounting for the sensor noise, setting a tolerance and a stop criterion for the number of iterations in the approximation algorithm. We have introduced this challenge within the context of model selection, but alternative strategies should be developed. They could depend on the application at hand, as a balance between computational time, needed accuracy of the surface and overfitting avoidance. The detection and analysis of deformation or changes between two or more surface approximations or inside an LR B-spline volume remain an open topic, for which the definition of distance between mathematical surfaces should be further investigated. This latter should account for both the uncertainty of the measurements and the surface/volume approximation.

Last but not least, we expect that the capacity of LIDAR and sonar type data acquisition technology will continue to grow in the next decades. This makes in even more urgent to turn the enormous more or less structured point clouds into mathematically structured information that can be efficient interrogated and analysed.

#### 7.2 Ongoing Research 111

We believe that representations that allow the granularity of the representation to adapt to the local behaviour of the underlying information intrinsic in point clouds are essential in this respect.

**Open Access** This chapter is licensed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license and indicate if changes were made.

The images or other third party material in this chapter are included in the chapter's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the chapter's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.